December 21, 2018

ODA Server Patch 18.3 available - and installed ...

As of today, Oracle Database Appliance software release 18.3.0.0.0 is available ... I took the chance to download and install the new release - this is the story:

Initial Situation:

I tested the update on an ODA X7-2M with system version 12.2.1.4.0, running five databases (12.1). There are three vlans configured on that system.

Documentation used:

The installation instructions were taken from docs.oracle.com: Oracle Database Appliance, Release 18.3, X7-2 Deployment and User's Guide for Linux x86-64, Chapter 7 'Patching Oracle Database Appliance', subchapter 'Patching Oracle Database Appliance Using the CLI'.

Remarks:

- You have to download about 15GB - just for the system update; the DB 18.x software is around 4GB
- do not copy the update software to an NFS share and do not run the update from such a directory
- if You are using NFS shares, make sure that a 'df -h' returns immediately. On my test system, two NFS file systems were not reachable at all, so 'df -h' never came back - which, in turn, left the server patching stuck at step 'Configuring GI'. After I had unmounted those file systems (using umount's -f switch), the patching continued
- qosmserver has to run and a crs resource ora.net1.network has to be configured
- the patch documentation is missing the hint to first update dcsagent (odacli upgrade-dcsagent -v 18.3.0.0.0), but it's a prerequisite
- remove any 'non-default' entries from oracle's .bash_profile prior to the update
- Step 'Setting AUDIT SYSLOG LEVEL' had a status 'Failure' at the end of the update process - overall status of the update was - nevertheless - successful (will check the reason for that failure later)
- Installation documentation says after step 'apply the server update' to check if the update was successful and then to issue an 'odacli update-storage'. ATTENTION! After a successful update, the server will reboot! Wait until the system is up again before You start the update-storage command!
- Server reboot took about 10 minutes on my X7-2M
- Documentation example for 'Update the storage components' is not correct: use '--rolling' or '-r' instead of '-rolling'
- update-storage using the '--rolling' option is not supported on ODAlite
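
The hanging 'df -h' from the remarks above can be caught before the patch run starts. A minimal sketch, assuming GNU coreutils' 'timeout' is available on the box; the function name 'finishes_within' and the 10-second limit in the usage comment are illustrative choices, not part of the ODA tooling:

```shell
# Return 0 if the given command completes within the given number of
# seconds, non-zero otherwise. KILL is used because a process blocked on
# a dead NFS mount may not react to the default TERM signal.
finishes_within() {
    limit="$1"
    shift
    timeout -s KILL "$limit" "$@" >/dev/null 2>&1
}

# Usage before patching (hypothetical 10-second limit):
#   finishes_within 10 df -h || echo "df hangs - check the NFS mounts first"
```

If the check fails, the unreachable NFS file systems are the first suspects, as described in the list above.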

That's it

The whole process took slightly more than one hour - and remember: the system reboots after a successful 'odacli update-server' (!)




December 7, 2018

ODA - Update to 12.2.1.4 - issue with ora.net1.network

Yesterday, I had to update an ODA from release 12.1.2.12 to 12.2.1.4. All prerequisite checks, including the prepatchreport, were successful ... but the update itself was not!

During the update, at stage 'GI Home Cloning', the update process failed and left the system in an 'unknown' state. A detailed analysis revealed the root cause: the CRS resource qosmserver was missing. So I tried to set up this missing link to a successful update:

srvctl add qosmserver
srvctl status qosmserver
srvctl start qosmserver

But qosmserver could not be started - resource 'ora.net1.network' was missing. Well, that's absolutely true - we do not have such a resource defined on our ODAs. I finally changed the first of our vlan definitions to ora.net1.network and could then start qosmserver.

Unfortunately, because of the interrupted update, GI is now running from its 12.2 home, while 'odacli describe-component' still shows the old 12.1 home as the current one. The result is that I cannot run update-server again (GI Home Cloning fails because the new GI home already exists).

Outcome: As an additional prerequisite, check whether the resource ora.net1.network exists on Your ODA and make sure qosmserver is up and running BEFORE starting an 'odacli update-server'.
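
This prerequisite check could be scripted roughly as follows. It is a sketch under assumptions: 'crsctl' must be in the PATH of the user running it, the check relies on the 'STATE=ONLINE' line that 'crsctl status resource' prints for a running resource, and 'ora.qosmserver' is the assumed CRS resource name behind 'srvctl ... qosmserver' (verify it with 'crsctl status resource -t' first). The function names are made up for this example.

```shell
# Return 0 if 'crsctl status resource <name>' reports STATE=ONLINE.
# A missing resource produces no STATE line, so it also counts as "not ok".
resource_online() {
    crsctl status resource "$1" 2>/dev/null | grep -q '^STATE=ONLINE'
}

# Check both prerequisites; print what is missing and return non-zero.
# 'ora.qosmserver' is an assumed resource name, see the note above.
check_update_prereqs() {
    rc=0
    for res in ora.net1.network ora.qosmserver; do
        if ! resource_online "$res"; then
            echo "prerequisite not met: $res is not ONLINE"
            rc=1
        fi
    done
    return "$rc"
}

# Usage:
#   check_update_prereqs && echo "ok to start 'odacli update-server'"
```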




November 7, 2018

DBID Problem @ ODA - or Count von Count (Sesame Street) calculates DBIDs

I did some more investigation to find an explanation for the extra long DBIDs and the resulting problems when creating new databases on an ODA ...

And here's the simple result:

Remember? In our environment, oracle's .bash_profile runs an additional script which prints all Oracle databases and listeners existing at that very moment, their status and the corresponding ORACLE_HOMEs to STDOUT. The output looks like this:

######################################################
# Following Databases/Aliases are on the ODA oda-o1234
######################################################
================================================================================
SID/PROCESS         STATUS  E_ORACLE_HOME
--------------------------------------------------------------------------------

+ASM1               up      /u01/app/12.2.0.1/oracle
TEST1               up      /u01/app/oracle/product/12.1.0.2/dbhome_1
TEST2c              up      /u01/app/oracle/product/12.1.0.2/dbhome_1
TEST2D              up      /u01/app/oracle/product/12.1.0.2/dbhome_1
TEST2E              up      /u01/app/oracle/product/12.1.0.2/dbhome_1

LISTENER            up      /u01/app/12.2.0.1/oracle

Now, take all numbers from this output, put them in one line, add the 'real' DBID from the database - and You'll have the dbid which is displayed (and used) by odacli in the internal derby database. :-)
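
The effect can be mimicked in a few lines of shell. This only reproduces the observed behaviour (digits of the profile output followed by the real DBID); it says nothing about how odacli is implemented internally, and the function name 'polluted_dbid' is made up for the illustration.

```shell
# Keep only the digits of whatever the login profile printed and append
# the real DBID. This mimics the observed effect, not odacli's actual code.
polluted_dbid() {
    profile_output="$1"
    real_dbid="$2"
    printf '%s%s\n' "$(printf '%s' "$profile_output" | tr -cd '0-9')" "$real_dbid"
}

# Example with the +ASM1 line of the overview and the DBID 2498916505:
#   polluted_dbid "+ASM1  up  /u01/app/12.2.0.1/oracle" 2498916505
#   prints 101122012498916505
```

The digits come from '+ASM1' (1), '/u01' (01) and '12.2.0.1' (12201); every further line of the overview adds its digits in the same way.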

The easy test:

Add two 'echo' lines to oracle's .bash_profile so that it looks like this:

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

export PATH
umask 022

echo "1111111111111111111111111111111111111111111111111"
echo "2222222222222222222222222222222222222222222222222"

Create a database using the 'odacli create-database' command:

odacli create-database -m -dh 665b9ce4-3a13-4a04-8ccd-548e66556def -n TEST2F -r ACFS

When the database is created, issue an 'odacli list-databases' and, subsequently, an 'odacli describe-database -i <id of that database>' - check the DBID (concatenated echo "1"s plus the echo "2"s plus the 'real' DBID):

Database details
----------------------------------------------------------------
                     ID: 2a5426bc-8db9-4d07-ba87-b54f06dbbcc3
            Description: TEST2F
                DB Name: TEST2F
             DB Version: 12.1.0.2
                DB Type: Si
             DB Edition: EE
                   DBID: 111111111111111111111111111111111111111111111111122222222222222222222222222222222222222222222222222498916505
Instance Only Database: false
                    CDB: false
               PDB Name:
    PDB Admin User Name:
                  Class: OLTP
                  Shape: Odb1
                Storage: ACFS
           CharacterSet: AL32UTF8
  National CharacterSet: AL16UTF16
               Language: AMERICAN
              Territory: AMERICA
                Home ID: 665b9ce4-3a13-4a04-8ccd-548e66556def
        Console Enabled: false
     Level 0 Backup Day: Sunday
    AutoBackup Disabled: false
                Created: November 7, 2018 12:58:45 PM CET
         DB Domain Name: world

The database's DBID - which is the last portion in odacli's DBID:

SQL> select dbid from v$database;

      DBID
----------
2498916505

Imho, the answer to the question 'Bug or Feature?' is quite clear ;-) And, by the way, Oracle Support is still investigating the problem.
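
One way to keep such an overview script without feeding its output to other tools might be to print only when a terminal is attached. This is an untested sketch: that the create-database job runs oracle's profile without a terminal is an assumption, and 'show_db_overview' is a hypothetical stand-in for the site-specific script.

```shell
# Placeholder for the site-specific overview script (hypothetical name).
show_db_overview() {
    echo "SID/PROCESS         STATUS  ORACLE_HOME"
}

# Print the overview only when stdout is a terminal, i.e. for a human
# logging in, and stay silent when the profile is sourced by a tool.
overview_if_terminal() {
    if [ -t 1 ]; then
        show_db_overview
    fi
}

# In .bash_profile, call 'overview_if_terminal' instead of the script itself.
```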




October 20, 2018

Update! - Unexpected Behaviour With V$DIAG_ALERT_EXT

Oops! Error messages from a test database in my production DB?!?


Story behind:

I wanted to know which "ORA-" errors had occurred in my production DB and issued a
select
  to_char(originating_timestamp,'DD.MM.YYYY HH24:MI:SS')
  ,message_text
from v$diag_alert_ext
where message_text like '%ORA-%'
and originating_timestamp > sysdate-31
order by originating_timestamp;
to get an overview of the current - let me say - error situation. So far - so good ...

Yesterday, I found messages from a TEST database when querying the PRODUCTION db, which made me suspicious:
"Errors in file /u01/app/oracle/diag/rdbms/TEST/TEST/trace/TEST_j000_45562.trc"

Checked that - and discussed it with some colleagues ... the outcome is:

The view V$DIAG_ALERT_EXT does not only contain information from the current DB but from all databases (and other components) on that system which use the same diagnostic_dest:
select  distinct component_id,filename
from v$diag_alert_ext
order by 1,2;
COMPONENT_ID FILENAME
------------ -----------------------------------------------------------------------
apx          /u01/app/oracle/diag/apx/+apx/+APX1/alert/log.xml
asm          /u01/app/oracle/diag/asm/+asm/+ASM1/alert/log.xml
clients      /u01/app/oracle/diag/clients/user_oracle/host_4163035053_107/alert/log.xml
clients      /u01/app/oracle/diag/clients/user_oracle/host_4163035053_82/alert/log.xml
crs          /u01/app/oracle/diag/crs/host/crs/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest1/DBTEST1/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest10/DBTEST10/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest11/DBTEST11/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest12/DBTEST12/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest13/DBTEST13/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest14/DBTEST14/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest2/DBTEST2/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest3/DBTEST3/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest4/DBTEST4/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest5/DBTEST5/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest6/DBTEST6/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest7/DBTEST7/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest8/DBTEST8/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/dbtest9/DBTEST9/alert/log.xml
rdbms        /u01/app/oracle/diag/rdbms/rocrtest/rocrtest/alert/log.xml
tnslsnr      /u01/app/oracle/diag/tnslsnr/host/asmnet1lsnr_asm/alert/log.xml
tnslsnr      /u01/app/oracle/diag/tnslsnr/host/listener/alert/log.xml
tnslsnr      /u01/app/oracle/diag/tnslsnr/host/listener_511/alert/log.xml
tnslsnr      /u01/app/oracle/diag/tnslsnr/host/listener_576/alert/log.xml
tnslsnr      /u01/app/oracle/diag/tnslsnr/host/listener_580/alert/log.xml

Well - that was unexpected ... 

Question is: Bug or feature? I'll open an SR and keep You up to date ...
By the way: I ran these tests on different Oracle releases: 11.2, 12.1, 12.2.
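
If only the messages of one database are wanted, the query can be narrowed with the COMPONENT_ID and FILENAME columns seen in the listing above. A small sketch, wrapped in a shell function so that the database name can be passed in; it assumes the path layout .../diag/rdbms/<db_unique_name>/<SID>/alert/log.xml shown in that listing, and the function name 'alert_errors_sql' is made up.

```shell
# Print a query limited to the alert log of one database.
# $1: lower-case db_unique_name as it appears in the ADR path (assumption:
#     the layout .../diag/rdbms/<db_unique_name>/<SID>/alert/log.xml).
alert_errors_sql() {
    cat <<EOF
select to_char(originating_timestamp,'DD.MM.YYYY HH24:MI:SS'), message_text
from v\$diag_alert_ext
where component_id = 'rdbms'
and filename like '%/diag/rdbms/$1/%'
and message_text like '%ORA-%'
and originating_timestamp > sysdate-31
order by originating_timestamp;
EOF
}

# Usage (hypothetical database name):
#   alert_errors_sql dbtest1 | sqlplus -s / as sysdba
```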

Update (November 14th, 2018)

Opened a Service Request - and Oracle Support brought an explanation:

"We have verified it. It is an expected scenario. As stated earlier V$DIAG_ALERT_EXT displays trace file and alert file data for the current container (It means all PDB) in a CDB (Even though you connect as ALTER SYSTEM SET CONTAINER - it will show all the database because all database resides on ORACLE HOME). 

Actually, V$DIAG_ALERT_EXT read the logs of all databases and listeners from the ADR Location (ORACLE HOME Directory) So, one connection to a database is enough to see all the database alert files and listener logs registered inside the ADR structure. 

So, this is an expected one. And not a BUG! 
"



REASON FOUND: DBID Problem @ ODA

Hi There!

In my blog post "http://robertcrames.blogspot.com/2018/10/oda-bug-and-poor-support-from-oracle.html" I described a problem: databases could no longer be created because of an incredibly long DBID, and Oracle Support was 'poor' in that specific case.

Well - I contacted Oracle Support and had a call with the Support Manager and then ... it worked smoothly 😏. I provided some more information and Oracle Support had an idea:

'Remove any changes from the .bash_profile and try again' ... Unfortunately, we had made no changes to root's .bash_profile (root is the 'initiator' of the create database process) :-( - but we had made some changes to oracle's .bash_profile. One of these changes was to set oracle's primary environment to the Grid Infrastructure instance +ASM1. I removed that entry, retried the 'odacli create-database' - and ... it works - the DBID is correct and plausible, everything runs fine. :-)


October 12, 2018

DBID Problem @ ODA ... Bug (?)

long story told short: 


  • Oracle Database Appliance X6-2M and X7-2M, 
  • oak 12.1.2.11 up to 12.2.1.4

We were no longer able to create databases on one of our 15 ODAs. A detailed analysis showed that the DBID - as returned by 'odacli describe-database -i <DB identifier>' - grows and grows and, once it is longer than 255 characters, any subsequent attempt to create a database fails and leaves the database in status 'creating'. No more databases can be created, nor can You delete the database which is left in status 'creating'.


This is what a correct DBID (result from a describe-database) looks like: 1653395885 - btw: this is exactly the real dbid found in the database (select dbid from v$database).

An - imho - 'corrupted' DBID looks like this: 2791133100113200112102253200112102127320011210212832001121021533200112102154320011210215632001121021573200112102159320011210217323200112102177232001121021793200112102181320011210218323200112102186320011210215800112102576011210251101121021196170968 - the last 'n' numbers represent the real dbid (select dbid from v$database).

Side effect, when the database creation has failed: 
odacli list-databases
DCS-10001:Internal error encountered: could not execute query.

Would be great if You could do this:

  • If You have more than five databases on an ODA, issue an 'odacli list-databases'
  • take the id from the last created database and issue an 'odacli describe-database -i <id>'
  • Post the result as a comment in this blog
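
The check can be scripted along these lines. It is a sketch: it parses the 'DBID:' line in the format of the describe-database output, and it treats more than 10 digits as implausible on the assumption that a real DBID is a 32-bit number. Both function names are made up for this example.

```shell
# Read 'odacli describe-database' output on stdin and print the DBID value.
extract_dbid() {
    awk -F': *' '/^ *DBID:/ { v = $2; sub(/[ \t\r]+$/, "", v); print v; exit }'
}

# Return 0 if the value consists of digits only and has at most 10 of them
# (assumed upper bound for a real DBID).
dbid_plausible() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "${#1}" -le 10 ]
}

# Usage:
#   dbid="$(odacli describe-database -i <id> | extract_dbid)"
#   dbid_plausible "$dbid" || echo "suspicious DBID: $dbid"
```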

Regarding oracle support:

The Service Request has now been open for weeks. It started as a Sev 2 and was raised to Sev 1 relatively early. Oracle Support updates the SR very rarely, and there is usually no answer at all when I update the SR. In the meantime I escalated the SR and asked for a callback from a manager - no result and no callback. Today I asked for another callback from another manager - we will see how long that takes.

@OracleSupport: that is very poor ... leaving customers helpless and uninformed with such severe Problems. Very disappointing ...

August 8, 2018

ODA X7 – no network connect after setup / re-image


Yesterday, I prepared a brand-new X7-2M. During the setup process / network configuration (using VLAN), I ran into a problem: although the network configuration was set up absolutely correctly, every attempt to connect to the system from a jump host was unsuccessful.

Long story short: If You want to use the SFP28 (fiber) ports an ODA X7 offers, You have to make some config changes to make them work.
First of all: do a firmware upgrade to version 20.08.01.14 or above. 
Next, configure the nic / nics to force the speed to 10000 and turn off autonegotiation. 

The firmware as well as a detailed description of all necessary steps can be found in MOS Note 2373070.1 ‘Using the onboard SFP28 ports on an ODA X7-2 server node’
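
For orientation only, the speed / autonegotiation part could look like this with the standard Linux 'ethtool'. The interface name is a hypothetical example, the setting made this way does not survive a reboot, and the MOS note remains the reference for the complete and permanent procedure.

```shell
# Force a NIC to 10 Gbit/s with autonegotiation switched off.
# $1: interface name (hypothetical example: em2)
force_10g() {
    ethtool -s "$1" speed 10000 autoneg off
}

# Usage (as root):
#   force_10g em2
#   ethtool em2    # shows the resulting speed and autonegotiation state
```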

How to copy Files to an ODA which is not available via network - by using the network

Sounds a little weird, I know - but works ... *lol*

Problem was: 

The brand-new ODA ships with old firmware for the SFP28 interfaces. This makes it impossible to connect the system to the network. To upgrade the interfaces You need to have the firmware files on the ODA ... but how can I copy the firmware to the ODA without a network connection?

Solution:

First of all: create an ISO file from the files You want to have on a virtual CD. On a Mac, this is quite easy to do: start Disk Utility, then 'File / New Image / Image from Folder' and follow the instructions. Disk Utility will create a dmg file which has to be renamed to iso. That's it, basically.
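
As an alternative to Disk Utility, the image can also be built from the macOS command line. A sketch, assuming macOS's 'hdiutil'; the function name and the paths in the usage comment are made up:

```shell
# Build an ISO 9660 / Joliet image from a folder on macOS.
# $1: source folder, $2: output file
make_iso() {
    hdiutil makehybrid -iso -joliet -o "$2" "$1"
}

# Usage (hypothetical paths):
#   make_iso ~/oda_firmware ~/oda_firmware.iso
```

This writes an ISO directly, so no renaming of a dmg file is involved.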

Next steps:

Open a browser window and connect to the ILOM (https://<ilom's hostname or ip>)
Log in using ILOM's root account
Launch the console
Open KVMS/Storage
Add Storage - choose the ISO You created and click 'Connect'

Next, in a terminal session, start the console application (start /HOST/console), and mount the cdrom: 

mkdir /media/cdrom
mount /dev/cdrom /media/cdrom

Copy the files to wherever You want on the ODA's file system :-)



April 4, 2018

Enterprise Manager and the disadvantages of WLS' Dynamic Monitoring Service (lots of metricdump files)

Had an interesting problem this morning with a customer's Windows server where Enterprise Manager 12c is running: disk C: - where the EM software is installed - was full. By the way: Imho, Oracle software should never ever be installed on C: ...

The Problem:

The reason for the full disk was WebLogic's Dynamic Monitoring Service (or a weak housekeeping tool ;-)).
Explained:
WebLogic has a feature called DMS, which collects some domain metrics and saves them to a file in the directory DOMAIN_HOME/servers/<ServerName>/logs/metrics for each managed server. Each file is about 600 to 800 KB in size, which is not that much - unless You have thousands of those files. The file names look like metricdump*, by the way. By default, the files are created in a 3-hour cycle.

The Solution:

For this customer's system, I decided to stop the Dynamic Monitoring Service for the AdminServer as well as for the EMGC_OMS1 managed server. To achieve this, find the file dms_config.xml (DOMAIN_HOME/config/fmwconfig/servers/<ServerName>) and change the value of 'enabled' to "false":

<dumpConfiguration>
  <dump intervalSeconds="10800" maxSizeMBytes="75" enabled="false"/>
</dumpConfiguration>

Restart EM to activate the change. 
Another option could be to change intervalSeconds to a higher value - for example once a day (86400 secs) or twice a day (43200 secs). Or, in case You have a properly working housekeeping tool running each day, configure it so that it deletes all files older than 'whatever-you-might-think'.
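
Such a housekeeping job could be as simple as the following sketch. It assumes a Unix-like host (or a Unix-style shell on the Windows server) with a 'find' that supports '-delete'; on plain Windows the same idea needs a native equivalent. The function name and the 7-day retention in the usage comment are examples only.

```shell
# Delete metricdump* files older than N days below a metrics directory
# and print the names of the deleted files.
# $1: directory, $2: age in days
purge_metricdumps() {
    find "$1" -type f -name 'metricdump*' -mtime +"$2" -print -delete
}

# Usage (hypothetical retention of 7 days):
#   purge_metricdumps "$DOMAIN_HOME/servers/EMGC_OMS1/logs/metrics" 7
```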



March 16, 2018

ODA X6-2M – Expanding Storage can be unsuccessful when impatient



In 'A Brief History of Time' (RIP Mr. Hawking), Chapter 2 is named 'Space and Time' – and this title fits perfectly with a problem I recently faced when expanding disk capacity in an ODA X6-2M.

Long story, told short

An ODA X6-2M, equipped with the standard disk configuration, ran out of space and my customer decided to add two NVMe disks. So he ordered them – and, by the way, we waited two weeks (!) for delivery.

Next step - mounting the disks. According to the documentation, expanding disk storage is quite an easy job:
  • Put the disks in
  • Set the disks in online state (odaadmcli power disk on pd_02 and odaadmcli power disk on pd_03)
  • Expand storage (odaadmcli expand storage)

In my environment, this resulted in a complete disaster:

  • The disks were already online when trying a 'power disk on',
  • 'expand storage' claimed that the disks already had an ASM signature and aborted,
  • 'odaadmcli show disk' showed the disks, but as UNKNOWN and with the /dev/nvme* names of the existing NVMes, whilst the old ones had, all of a sudden, new names … (!)
  • The command to power off said that the disks were not online, but a subsequent removal of the disks resulted in a system crash …

In short: horror!! I opened a Service Request, tried this, tried that – and finally, after three weeks of fighting with Oracle Support, I made one last attempt: put the disks in, went to the coffee bar, returned about an hour later. I started to set the disks online (result: the disks are already online) and issued an 'odaadmcli expand storage' – expecting the same result as with every attempt before … but, surprisingly, the command returned no error, ASM started rebalancing and everything went fine :-)


Solution / Chapter 2 ... 'Space and Time'

You'll get the disk space, when You take Your time ...

Put the disks in, wait at least 15 minutes, power on the disks and issue the 'expand storage' command. Oracle Support's explanation of this behavior was (in my words): the oakd process needs some time to prepare the disks in a way that lets expand storage work properly …
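
For the impatient, the waiting can be left to a small wrapper. A sketch using only the 'odaadmcli' commands mentioned in this post; the function name is made up, and the wrapper deliberately does not stop when 'power disk on' complains that a disk is already online, since that message turned out to be harmless here.

```shell
# Wait for oakd to settle, then power on the new disks and expand storage.
# $1: seconds to wait (15 minutes = 900), remaining arguments: disk names
expand_after_wait() {
    wait_seconds="$1"
    shift
    sleep "$wait_seconds"
    for disk in "$@"; do
        # May report that the disk is already online.
        odaadmcli power disk on "$disk"
    done
    odaadmcli expand storage
}

# Usage, started right after the disks have been inserted:
#   expand_after_wait 900 pd_02 pd_03
```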


ODA Xx – Space Waste When Using ACFS (instead of ASM) For Databases



Disclaimer: I will not discuss the necessity of an ACFS file system for storing database files, nor compare ASM and ACFS or their benefits. What to use / what fits best in Your environment is up to You, folks … ;-)

Pretty sure that this is a well-known problem for lots of people – but for all others:

When creating a database using ACFS as storage option, a file system of 100GB is created. This 100GB is not a configurable value, by the way. The result: when having lots of small databases, using ACFS wastes a lot of disk space, as every database's file system occupies 100GB.
Oracle Support's answer to that is: "In the future releases you may see the customized option to define the size of ACFS while creating the database". Nevertheless: It is possible to reduce that 100GB FS to an appropriate size. Here's how:  

Examples are based on OAK 12.2.1.2.0, used db release: 12.1.0.2

Create the database:

[oracle@eoda03 ~]$ odacli create-database -m -n RCRACFS1 --no-cdb  --dbstorage ACFS -dh cce28d1b-01cc-4917-8237-38683d34f53e

… results in (from a FS perspective)

Filesystem                   Size  Used Avail Use% Mounted on
/dev/asm/datrcracfs1-262     100G  2.5G   98G   3% /u02/app/oracle/oradata/RCRACFS1

Reduce the filesystem size to 10GB:

[oracle@eoda03 ~]$ acfsutil size -90G /u02/app/oracle/oradata/RCRACFS1
acfsutil size: new file system size: 10737418240 (10240MB)

Check the filesystem size

[oracle@oda03 ~]$ df -h /u02/app/oracle/oradata/RCRACFS1
Filesystem                   Size  Used Avail Use% Mounted on
/dev/asm/datrcracfs1-262      20G  2.5G   18G  13% /u02/app/oracle/oradata/RCRACFS1

20G?? 100 minus 90 is 20? Possibly yes (you never know ;-)) – on the other hand, a better explanation: 20GB could be the minimum size for an ACFS file system.
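
The arithmetic itself can be checked quickly: the size reported by acfsutil (10737418240 bytes) is exactly 10 GiB, so the subtraction was right and the 20G shown by df must come from somewhere else. The two helper functions below are made up for this check.

```shell
# Convert GiB to bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Build the relative size argument for 'acfsutil size' when shrinking.
# $1: current size in GB, $2: target size in GB
shrink_arg() {
    echo "-$(( $1 - $2 ))G"
}

# gib_to_bytes 10    prints 10737418240
# shrink_arg 100 10  prints -90G
```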

Conclusion:

Yes! It is that easy :-)


March 5, 2018

EXPDP using an External Password Store - facing a new 'release' of a well known performance issue

After a while without blogging, here is a new post about a very well known performance issue:

export (exp as well as expdp) is usually slower when using a TCP-based TNS alias instead of setting ORACLE_SID. Remember? ;-)

In a little more detail:
An "expdp system/manager@db directory= ..." (usually) takes more time than an
"export ORACLE_SID=DB; expdp system/manager directory= ..."

But what if You have to use an external password store - a wallet - to avoid clearly readable passwords either in a file or on the command line? One part of the whole procedure is to define the TNS alias in tnsnames.ora - and most of us define a TCP-based alias. This is - imho - not the best way of connecting to a local database - IPC or BEQ are way better for that.

So, I solved an expdp performance issue by using a Bequeath-based TNS alias:

Created a new wallet:
mkstore -wrl /u01/app/oracle/wallet -create

Created an alias to connect to the DB:
mkstore -wrl /u01/app/oracle/wallet -createCredential db_system_beq.world system manager

Added a BEQ connect description to my tnsnames.ora:
DB_SYSTEM_BEQ.WORLD =
  (DESCRIPTION =
    (ADDRESS =
          (PROTOCOL = BEQ)
          (PROGRAM = oracle)
          (ARGV0 = oracleDB)
          (ARGS = '(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=BEQ)))')
    )
    (CONNECT_DATA =
        (SERVICE_NAME = DB)
    )

  )
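
For '/@alias' connects to find the wallet, the client's sqlnet.ora usually needs a wallet location as well. The post does not show that part, so the following is only a sketch of the commonly documented entries, printed by a small (made-up) helper function; the target file is typically $ORACLE_HOME/network/admin/sqlnet.ora or the one in $TNS_ADMIN.

```shell
# Print the sqlnet.ora entries that point the client at a wallet directory.
# $1: wallet directory
sqlnet_wallet_entries() {
    cat <<EOF
WALLET_LOCATION =
  (SOURCE =
    (METHOD = FILE)
    (METHOD_DATA = (DIRECTORY = $1))
  )
SQLNET.WALLET_OVERRIDE = TRUE
EOF
}

# Usage: review the output, then append it to sqlnet.ora by hand:
#   sqlnet_wallet_entries /u01/app/oracle/wallet
```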

Set the environment for my db and issued an expdp:
export ORACLE_SID=DB; expdp /@db_system_beq.world directory= ...

Result:

Job "SYSTEM"."SYS_EXPORT_FULL_01" successfully completed at Mon Mar 5 15:04:15 2018 elapsed 0 00:00:43 

Result when using a TCP based TNS alias:

Job "SYSTEM"."SYS_EXPORT_FULL_01" successfully completed at Mon Mar 5 14:58:03 2018 elapsed 0 00:01:31 

Try it - and post Your results as comment.



Copyright © Robert Crames' Oracle Database and Middleware Blog | Powered by Blogger