
Oracle TNS-12535 and Dead Connection Detection


These days everything goes to the cloud or is colocated somewhere in a shared infrastructure. In this post I’ll talk about sessions being disconnected from your databases, firewalls and dead connection detection.

Changes

We moved a number of 11g databases from one data centre to another.

Symptoms

Many of you have probably seen the following error in your database alert log – “TNS-12535: TNS:operation timed out” – and if you haven’t, you will definitely see it some day.

Consider the following error from database alert log:

Fatal NI connect error 12170.

VERSION INFORMATION:
TNS for Linux: Version 11.2.0.3.0 - Production
Oracle Bequeath NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
TCP/IP NT Protocol Adapter for Linux: Version 11.2.0.3.0 - Production
Time: 12-MAR-2015 10:28:08
Tracing not turned on.
Tns error struct:
ns main err code: 12535

TNS-12535: TNS:operation timed out
ns secondary err code: 12560
nt main err code: 505

TNS-00505: Operation timed out
nt secondary err code: 110
nt OS err code: 0
Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=49831))
Thu Mar 12 10:28:09 2015

Now this error indicates timing issues between the server and the client. It’s important to mention that those errors are RESULTANT – they are informational and not the actual cause of the disconnect. Although this error might happen for a number of reasons, it is commonly associated with firewalls or slow networks.

Troubleshooting

The best way to understand what’s happening is to build a histogram of the duration of the sessions. In particular we want to understand whether disconnects are sporadic and random or whether they follow a specific pattern.

To do so you need to parse the listener log and locate the following line from the above example:

(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=49831))

Since the port is random you might not get the same record, or if you do, it might be days apart.
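If you want to script that lookup, here is a minimal sketch – the listener.log path assumes the default 11g ADR location and the address string is just the one from the example above:

ADDR='(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=49831))'
grep -F "$ADDR" /u01/app/oracle/diag/tnslsnr/$(hostname -s)/listener/trace/listener.log   # adjust the path to your diag destination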

Here’s what I found in the listener:

12-MAR-2015 08:16:52 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ORCL)(CID=(PROGRAM=app)(HOST=apps01)(USER=scott))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=49831)) * establish * ORCL * 0

In other words – at 8:16 the user scott established connection from host 192.168.0.10.

Now if you compare both records you’ll get the duration of the session:

Established: 12-MAR-2015 08:16:52
Disconnected: Thu Mar 12 10:28:09 2015

Here are couple of other examples:
alertlog:

Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=20620))
Thu Mar 12 10:31:20 2015 

listener.log:

12-MAR-2015 08:20:04 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ORCL)(CID=(PROGRAM=app)(HOST=apps01)(USER=scott))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=20620)) * establish * ORCL * 0 

alertlog:

Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=48157))
Thu Mar 12 10:37:51 2015 

listener.log:

12-MAR-2015 08:26:36 * (CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=ORCL)(CID=(PROGRAM=app)(HOST=apps01)(USER=scott))) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.10)(PORT=48157)) * establish * ORCL * 0 

alertlog:

Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.11)(PORT=42618))
Tue Mar 10 19:09:09 2015 

listener.log

10-MAR-2015 16:57:54 * (CONNECT_DATA=(CID=(PROGRAM=)(HOST=__jdbc__)(USER=root))(SERVICE_NAME=ORCL1)(SERVER=DEDICATED)) * (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.0.11)(PORT=42618)) * establish * ORCL1 * 0 

As you may have noticed, the errors follow a very strict pattern – each session gets disconnected exactly 2hrs 11mins after it has been established.

Cause

Given the repetitive behaviour of the issue, and that it happened for multiple databases and application servers, we can conclude it’s definitely a firewall issue.

The firewall recognizes the TCP protocol, keeps a record of established connections and also recognizes TCP connection closure packets (TCP FIN type packets). However, sometimes the client may abruptly end communication without closing the endpoints properly by sending a FIN packet, in which case the firewall will not know that the endpoints will no longer use the opened channel. To resolve this problem the firewall imposes a BLACKOUT on connections that stay idle for a predefined amount of time.

The only issue with a BLACKOUT is that neither of the sides will be notified.

In our case the firewall will disconnect IDLE sessions after around 2hrs of inactivity.

Solution

The solution on the database server side is to use the Dead Connection Detection (DCD) feature. DCD detects when a connection has terminated unexpectedly and flags the dead session so PMON can release the resources associated with it.

DCD sets a timer when a session is initiated and when the timer expires SQL*Net on the server sends a small 10-byte probe packet to the client to make sure the connection is still active. If the client has terminated unexpectedly the server will get an error, the connection will be closed and the associated resources will be released. If the connection is still active then the probe packet is discarded and the timer is reset.

To enable DCD you need to set SQLNET.EXPIRE_TIME in the sqlnet.ora of your RDBMS home:

cat >> $ORACLE_HOME/network/admin/sqlnet.ora
SQLNET.EXPIRE_TIME=10 

This will set the timer to 10 minutes. Remember that sessions need to reconnect for the change to take effect – it won’t work for existing connections.

Firewalls are becoming smarter and they can now inspect packets even deeper. Make sure the following settings are also disabled:
– SQLNet fixup protocol
– Deep Packet Inspection (DPI)
– SQLNet packet inspection
– SQL Fixup

I had a similar issue with Data Guard already, read more here – Smart Firewalls

How to test Dead Connection Detection

You might want to test or make sure DCD really works. You’ve got multiple options here – Oracle SQL*Net client trace, Oracle SQL*Net server trace, sniffing the network with a packet analyzer, or using strace to trace the server process. I used strace since I had access to the database server and it is non-intrusive.

1. Establish a connection to the database through SQL*Net

2. Find the process number for your session:

SQL>  select SPID from v$process where ADDR in (select PADDR from v$session where username='SVE');

SPID
------------------------
62761 

3. Trace the process

[oracle@dbsrv ~]$ strace -tt -f -p 62761
Process 62761 attached - interrupt to quit
11:36:58.158348 --- SIGALRM (Alarm clock) @ 0 (0) ---
11:36:58.158485 rt_sigprocmask(SIG_BLOCK, [], NULL, 8) = 0
....
11:46:58.240065 --- SIGALRM (Alarm clock) @ 0 (0) ---
11:46:58.240211 rt_sigprocmask(SIG_BLOCK, [], NULL, 8) = 0
...
11:46:58.331063 write(20, "\0\n\0\0\6\20\0\0\0\0", 10) = 10
... 

What I did was attach to the process, simulate some activity at 11:36 and then leave the session IDLE. Ten minutes later the server process sent a 10-byte probe packet to the client to check if the connection was still alive.

Conclusion

The errors in the alert log disappeared after I enabled DCD.

Make sure to enable DCD if you host your databases in a shared infrastructure or there are firewalls between your database and application servers.

References
How to Check if Dead Connection Detection (DCD) is Enabled in 9i ,10g and 11g (Doc ID 395505.1)
Alert Log Errors: 12170 TNS-12535/TNS-00505: Operation Timed Out (Doc ID 1628949.1)
Resolving Problems with Connection Idle Timeout With Firewall (Doc ID 257650.1)
Dead Connection Detection (DCD) Explained (Doc ID 151972.1)



Oracle Exadata X6 released


Oracle has just announced the next generation of Exadata Database Machine – X6-2 and X6-8.

Here are the changes for Exadata X6-2:
1) X6-2 Database Server: As always the hardware has been updated and the 2-socket database servers are now equipped with the latest twenty-two-core Intel Xeon E5-2699 v4 “Broadwell” processors, compared to the X5 where we had eighteen-core Intel Xeon E5-2699 v3 processors. The memory is still DDR4; the default configuration comes with 256GB and can be expanded to 768GB. The local storage can now be upgraded from the default of 4 drives to 8 to allow more local storage in case of consolidation with Oracle OVM.
2) X6-2 Storage Server HC: The storage server gets the new version of the CPUs as well, that is the ten-core Intel Xeon E5-2630 v4 processor (it was the eight-core Intel Xeon E5-2630 v3 in the X5). The flash cards are upgraded as well, to 3.2 TB Sun Accelerator Flash F320 NVMe PCIe cards for a total of 12.8 TB of flash cache (2x the capacity of the X5, which had 1.6 TB F160 cards).
2.1) X6-2 Storage Server EF: similarly to the High Capacity storage server this one gets the CPU and flash cards upgraded. The NVMe PCIe flash drives are upgraded from 1.6 TB to 3.2 TB, which gives you a total raw capacity of 25.6 TB per server.

This time Oracle released the Exadata X6-8 together with the X6-2. The changes aren’t many – I have to say that the X6-8 compute node looks exactly the same as the X5-8 in terms of specs, so I guess the Exadata X6-8 actually consists of X5-8 compute nodes with X6-2 storage servers. Oracle’s vision for those big monsters is that they are specifically optimized for Database as a Service (DBaaS) and Database In-Memory. Indeed, with 12 TB of memory we can host hundreds of databases or load a whole database in memory.

By the looks of it Exadata X6-2 and Exadata X6-8 will require the latest Exadata 12.1.2.3.0 software. This software has been around for some time now and has some new features:
1) Performance Improvements for Software Upgrades – I can confirm that; in a recent upgrade to 12.1.2.3.0 the cell upgrade took a bit more than an hour.
2) VLAN tagging support in OEDA – That’s not a fundamentally new or exciting feature, as VLAN tagging was available before. Now it can be done through the OEDA, hence it can be part of the deployment.
3) Quorum disks on database servers to enable high redundancy on quarter and eighth racks – You can now use database servers to deploy quorum disks and enable placement of the voting disks on high redundancy disk groups on smaller (quarter and eighth) racks. Here is more information – Managing Quorum Disks Using the Quorum Disk Manager Utility
4) Storage Index preservation during rebalance – The feature enables Storage Indexes to be moved along with the data when a disk hits a predictive failure or true failure.
5) ASM Disk Size Checked When Reducing Grid Disk Size – this is a check on the storage server to make sure you cannot shrink a grid disk before decreasing the size of the corresponding ASM disk.

Capacity-On-Demand Licensing:
1) For Exadata X6-2 a minimum of 14 cores must be enabled per server.
2) For Exadata X6-8 a minimum of 56 cores must be enabled per server.

Here’s something interesting:
OPTIONAL CUSTOMER SUPPLIED ETHERNET SWITCH INSTALLATION IN EXADATA DATABASE MACHINE X6-2
Each Exadata Database Machine X6-2 rack has 2U available at the top of the rack that can be used by customers to optionally install their own client network Ethernet switches in the Exadata rack instead of in a separate rack. Some space, power, and cooling restrictions apply.


Dead Connection Detection in Oracle Database 12c


In my earlier post I discussed what Dead Connection Detection is and why you should use it – read more here Oracle TNS-12535 and Dead Connection Detection

The pre-12c implementation of DCD used TNS packets to “ping” the client and relied on the underlying TCP stack, which sometimes may take longer. In 12c this has changed and DCD probes are implemented by the TCP stack. The DCD probes now use the TCP KEEPALIVE socket option to check if the connection is still usable.

To use the new implementation set SQLNET.EXPIRE_TIME in sqlnet.ora to the amount of time between the probes in minutes. If the operating system supports TCP keep-alive tuning then Oracle Net automatically uses the new method. The new mechanism is supported on all platforms except Solaris.

The following parameters are associated with the TCP keep-alive probes:
TCP_KEEPIDLE  – specifies the period of inactivity before the probe is sent. The parameter takes its value from SQLNET.EXPIRE_TIME.
TCP_KEEPCNT   – the number of keep-alive probes to be sent; it is always set to 10.
TCP_KEEPINTVL – specifies the delay between probes if a keep-alive packet is sent and no acknowledgment is received; it is always set to 6.
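If you want to confirm the new mechanism is being used, one option is to look at the socket timers on the database server – a minimal sketch, assuming Linux with iproute2 and the default listener port 1521:

ss -tno state established '( sport = :1521 )'    # keep-alive connections show a timer:(keepalive,...) entry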

If you need to revert to the pre-12c DCD mechanism (the 10-byte TNS probe) add the following parameter in sqlnet.ora:
USE_NS_PROBES_FOR_DCD=true

 


How to resolve missing dependency on exadata-sun-computenode-minimum


I’ve been really busy the last few months – apart from spending a lot of time on the M25 I’ve been doing a lot of Exadata installations and consolidations. I haven’t posted for some time now but the good news is that I’ve got many drafts and presentation ideas.

This is a quick post about an issue I had recently. I had to integrate AD authentication over Kerberos on the compute nodes (blog post to follow) but had to do a compute node upgrade before that. This was an Exadata X5-2 QR running 12.1.2.1.1 which had to be upgraded to 12.1.2.3.1, but I was surprised when dbnodeupdate failed with a ‘Minimum’ dependency check failure. You’ll also notice the following in the logs:

exa01db01a: Exadata capabilities missing (capabilities required but not supplied by any package)
exa01db01a NOTE: Unexpected configuration - Contact Oracle Support

Starting with 11.2.3.3.0 the exadata-*computenode-exact and exadata-*computenode-minimum rpms were introduced. An update to 11.2.3.3.0 or later by default assumes the ‘exact’ rpm will be used to ‘update to’ with yum, hence before running the upgrade dbnodeupdate will check if there are missing packages/dependencies.

The best way to check what is missing is to run yum check:

[root@exa01db01a ~]# yum check
Loaded plugins: downloadonly
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of elfutils-libelf-devel >= ('0', '0.158', '3.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of elfutils-libelf-devel(x86-64) >= ('0', '0.158', '3.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of glibc-devel(x86-32) >= ('0', '2.12', '1.149.el6_6.5')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libsepol(x86-32) >= ('0', '2.0.41', '4.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libselinux(x86-32) >= ('0', '2.0.94', '5.8.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of elfutils-libelf(x86-32) >= ('0', '0.158', '3.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libcom_err(x86-32) >= ('0', '1.42.8', '1.0.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of e2fsprogs-libs(x86-32) >= ('0', '1.42.8', '1.0.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libaio(x86-32) >= ('0', '0.3.107', '10.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libaio-devel(x86-32) >= ('0', '0.3.107', '10.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libstdc++-devel(x86-32) >= ('0', '4.4.7', '11.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of compat-libstdc++-33(x86-32) >= ('0', '3.2.3', '69.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of zlib(x86-32) >= ('0', '1.2.3', '29.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libxml2(x86-32) >= ('0', '2.7.6', '17.0.1.el6_6.1')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of elfutils >= ('0', '0.158', '3.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of elfutils(x86-64) >= ('0', '0.158', '3.2.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of ntsysv >= ('0', '1.3.49.3', '2.el6_4.1')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of ntsysv(x86-64) >= ('0', '1.3.49.3', '2.el6_4.1')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of glibc(x86-32) >= ('0', '2.12', '1.149.el6_6.5')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of nss-softokn-freebl(x86-32) >= ('0', '3.14.3', '18.el6_6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libgcc(x86-32) >= ('0', '4.4.7', '11.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of libstdc++(x86-32) >= ('0', '4.4.7', '11.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of compat-libstdc++-296 >= ('0', '2.96', '144.el6')
exadata-sun-computenode-minimum-12.1.2.1.1.150316.2-1.x86_64 has missing requires of compat-libstdc++-296(x86-32) >= ('0', '2.96', '144.el6')
Error: check all

Somehow all x86-32 packages and three x86-64 packages had been removed. The x86-32 packages would be removed as part of the upgrade anyway – they were not present after the upgrade. I didn’t spend too much time trying to understand why or how the packages were removed. I was told additional packages had been installed before and then removed. Perhaps one of them had a few dependencies and things got messed up when it was removed.

Anyway, to solve this you need to download the patch for the same version (12.1.2.1.1). The p20746761_121211_Linux-x86-64.zip patch is still available from MOS 888828.1. After that you unzip it, mount the ISO, test install all the packages to make sure nothing is missing and there are no conflicts, and then finally install the packages:
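The unzip and mount steps look roughly like this – note the ISO file name and directory layout here are illustrative, check what the patch actually contains:

unzip p20746761_121211_Linux-x86-64.zip
mount -o loop 12.1.2.1.1.150316.2.iso /mnt/iso    # ISO name is illustrative
cd /mnt/iso/x86_64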

[root@exa01db01a x86_64]# rpm -ivh --test zlib-1.2.3-29.el6.i686.rpm glibc-2.12-1.149.el6_6.5.i686.rpm nss-softokn-freebl-3.14.3-18.el6_6.i686.rpm libaio-devel-0.3.107-10.el6.i686.rpm libaio-0.3.107-10.el6.i686.rpm e2fsprogs-libs-1.42.8-1.0.2.el6.i686.rpm libgcc-4.4.7-11.el6.i686.rpm libcom_err-1.42.8-1.0.2.el6.i686.rpm elfutils-libelf-0.158-3.2.el6.i686.rpm libselinux-2.0.94-5.8.el6.i686.rpm libsepol-2.0.41-4.el6.i686.rpm glibc-devel-2.12-1.149.el6_6.5.i686.rpm elfutils-libelf-devel-0.158-3.2.el6.x86_64.rpm libstdc++-devel-4.4.7-11.el6.i686.rpm libstdc++-4.4.7-11.el6.i686.rpm compat-libstdc++-296-2.96-144.el6.i686.rpm compat-libstdc++-33-3.2.3-69.el6.i686.rpm libxml2-2.7.6-17.0.1.el6_6.1.i686.rpm elfutils-0.158-3.2.el6.x86_64.rpm ntsysv-1.3.49.3-2.el6_4.1.x86_64.rpm
Preparing...                ########################################### [100%]

[root@exa01db01a x86_64]# rpm -ivh zlib-1.2.3-29.el6.i686.rpm glibc-2.12-1.149.el6_6.5.i686.rpm nss-softokn-freebl-3.14.3-18.el6_6.i686.rpm libaio-devel-0.3.107-10.el6.i686.rpm libaio-0.3.107-10.el6.i686.rpm e2fsprogs-libs-1.42.8-1.0.2.el6.i686.rpm libgcc-4.4.7-11.el6.i686.rpm libcom_err-1.42.8-1.0.2.el6.i686.rpm elfutils-libelf-0.158-3.2.el6.i686.rpm libselinux-2.0.94-5.8.el6.i686.rpm libsepol-2.0.41-4.el6.i686.rpm glibc-devel-2.12-1.149.el6_6.5.i686.rpm elfutils-libelf-devel-0.158-3.2.el6.x86_64.rpm libstdc++-devel-4.4.7-11.el6.i686.rpm libstdc++-4.4.7-11.el6.i686.rpm compat-libstdc++-296-2.96-144.el6.i686.rpm compat-libstdc++-33-3.2.3-69.el6.i686.rpm libxml2-2.7.6-17.0.1.el6_6.1.i686.rpm elfutils-0.158-3.2.el6.x86_64.rpm ntsysv-1.3.49.3-2.el6_4.1.x86_64.rpm
Preparing...              ########################################### [100%]
1:libgcc                  ########################################### [  5%]
2:elfutils-libelf-devel   ########################################### [ 10%]
3:nss-softokn-freebl      ########################################### [ 15%]
4:glibc                   ########################################### [ 20%]
5:glibc-devel             ########################################### [ 25%]
6:elfutils                ########################################### [ 30%]
7:zlib                    ########################################### [ 35%]
8:libaio                  ########################################### [ 40%]
9:libcom_err              ########################################### [ 45%]
10:libsepol               ########################################### [ 50%]
11:libstdc++              ########################################### [ 55%]
12:libstdc++-devel        ########################################### [ 60%]
13:libaio-devel           ########################################### [ 65%]
14:libselinux             ########################################### [ 70%]
15:e2fsprogs-libs         ########################################### [ 75%]
16:libxml2                ########################################### [ 80%]
17:elfutils-libelf        ########################################### [ 85%]
18:compat-libstdc++-296   ########################################### [ 90%]
19:compat-libstdc++-33    ########################################### [ 95%]
20:ntsysv                 ########################################### [100%]

[root@exa01db01a x86_64]# yum check
Loaded plugins: downloadonly
check all

After that the dbnodeupdate check completed successfully and I upgraded the node to 12.1.2.3.1 in no time.

With Exadata you are allowed to install packages on the compute nodes as long as they don’t break any dependencies, but you cannot install anything on the storage cells. Here’s Oracle’s official statement:
Is it acceptable / supported to install additional or 3rd party software on Exadata machines and how to check for conflicts? (Doc ID 1541428.1)

Update 23.08.2016:
You might also get errors for two more packages in case you have updated from OEL5 to OEL6 and now try to patch the compute node:

fuse-2.8.3-4.0.2.el6.x86_64 has missing requires of kernel >= ('0', '2.6.14', None)
2:irqbalance-1.0.7-5.0.1.el6.x86_64 has missing requires of kernel >= ('0', '2.6.32', '358.2.1')

Refer to the following note for more information and how to fix it:

Grid Infrastructure 12c installation fails because of 255 in the subnet ID


I was doing another GI 12.1.0.2 cluster installation last month when I got a really weird error.

While root.sh was running on the first node I got the following error:

2016/07/01 15:02:10 CLSRSC-343: Successfully started Oracle Clusterware stack
2016/07/01 15:02:23 CLSRSC-180: An error occurred while executing the command '/ocw/grid/bin/oifcfg setif -global eth0/10.118.144.0:public eth1/10.118.255.0:cluster_interconnect' (error code 1)
2016/07/01 15:02:24 CLSRSC-287: FirstNode configuration failed
Died at /ocw/grid/crs/install/crsinstall.pm line 2398.

I was surprised to find the following error in the rootcrs log file:

2016-07-01 15:02:22: Executing cmd: /ocw/grid/bin/oifcfg setif -global eth0/10.118.144.0:public eth1/10.118.255.0:cluster_interconnect
2016-07-01 15:02:23: Command output:
> PRIF-15: invalid format for subnet
>End Command output

A quick MOS search suggested that my installation failed because I had 255 in the subnet ID:
root.sh fails with CLSRSC-287 due to: PRIF-15: invalid format for subnet (Doc ID 1933472.1)

Indeed we had 255 in the private network subnet (10.118.255.0). Fortunately this was our private network, which was easy to change, but you will still hit this issue if your public network has 255 in the subnet ID.
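A quick way to spot the problem before running root.sh is to check every interface’s network ID for a 255 octet – a minimal sketch, assuming iproute2 and ipcalc are available on the node:

for cidr in $(ip -o -4 addr show | awk '{print $4}'); do
  net=$(ipcalc -n "$cidr" | cut -d= -f2)                 # e.g. 10.118.255.0
  echo "$net" | tr '.' '\n' | grep -qx 255 && echo "WARNING: subnet $net contains a 255 octet"
done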

Extending an Exadata Eighth Rack to a Quarter Rack


In the past year I’ve done a lot of Exadata deployments and probably half of them were eighth racks. It’s one of those temporary things – let’s do it now but we’ll change it later. It’s the same with the upgrades – I’ve never seen anyone doing an upgrade from an eighth rack to a quarter. However, a month ago one of our customers asked me to upgrade their three X5-2 HC 4TB units from an eighth to a quarter rack configuration.

What’s the difference between an eighth rack and a quarter rack

The X5-2 Eighth Rack and X5-2 Quarter Rack have the same hardware and look exactly the same. The only difference is that only half of the compute power and storage space on an eighth rack is usable. In an eighth rack the compute nodes have half of their CPUs activated – 18 cores per server. It’s the same for the storage cells – 16 cores per cell, and only six hard disks and two flash cards are active.

While this is true for X3, X4 and X5, things have slightly changed for X6. Up until now, eighth rack configurations had all the hard disks and flash cards installed but only half of them were usable. The new Exadata X6-2 Eighth Rack High Capacity configuration has half of the hard disks and flash cards removed. To extend an X6-2 HC to a quarter rack you need to add high capacity disks and flash cards to the system. This is only required for High Capacity configurations because X6-2 Eighth Rack Extreme Flash storage servers have all flash drives enabled.

What are the main steps of the upgrade:

  • Activate Database Server Cores
  • Activate Storage Server Cores and disks
  • Create eight new cell disks per cell – six hard disk and two flash disks
  • Create all grid disks (DATA01, RECO01, DBFS_DG) and add them to the disk groups
  • Expand the flashcache onto the new flash disks
  • Recreate the flashlog on all flash cards

Here are few things you need to keep in mind before you start:

  • The compute node upgrade requires a reboot for the changes to come into effect.
  • The storage cell upgrade does NOT require a reboot and is an online operation.
  • The upgrade work is low risk – your data is secure and redundant at all times.
  • This post is about an X5 upgrade. If you were to upgrade an X6 then before you begin you need to install the six 8 TB disks in HDD slots 6 – 11 and install the two F320 flash cards in PCIe slots 1 and 4.

Upgrade of the compute nodes

Well, this is really straightforward and you can do it at any time. Remember that you need to restart the server for the change to come into effect:

dbmcli -e alter dbserver pendingCoreCount=36 force
DBServer exa01db01 successfully altered. Please reboot the system to make the new pendingCoreCount effective.

Reboot the server to activate the new cores. It will take around 10 minutes for the server to come back online.

Check the number of cores after server comes back:

dbmcli -e list dbserver attributes coreCount
coreCount:               36/36

 

Make sure you’ve got the right number of cores. These systems allow capacity on demand (CoD) and in my case the customer wanted me to activate only 28 cores per server.
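For a CoD configuration like that, the same command is used with the reduced core count, followed by a reboot as above – a sketch, run on each database server:

dbmcli -e alter dbserver pendingCoreCount=28 force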

Upgrade of the storage cells

Like I said earlier, the upgrade of the storage cells does NOT require a reboot and can be done online at any time.

The following needs to be done on each cell. You can, of course, use dcli but I wanted to do that cell by cell and make sure each operation finishes successfully.

1. First, upgrade the configuration from an eighth to a quarter rack:

[root@exa01cel01 ~]# cellcli -e list cell attributes cpuCount,eighthRack
cpuCount:               16/32
eighthRack:             TRUE

[root@exa01cel01 ~]# cellcli -e alter cell eighthRack=FALSE
Cell exa01cel01 successfully altered

[root@exa01cel01 ~]# cellcli -e list cell attributes cpuCount,eighthRack
cpuCount:               32/32
eighthRack:             FALSE

 

2. Create cell disks on top of the newly activated physical disks

Like I said – this is an online operation and you can do it at any time:

[root@exa01cel01 ~]# cellcli -e create celldisk all
CellDisk CD_06_exa01cel01 successfully created
CellDisk CD_07_exa01cel01 successfully created
CellDisk CD_08_exa01cel01 successfully created
CellDisk CD_09_exa01cel01 successfully created
CellDisk CD_10_exa01cel01 successfully created
CellDisk CD_11_exa01cel01 successfully created
CellDisk FD_02_exa01cel01 successfully created
CellDisk FD_03_exa01cel01 successfully created

 

3. Expand the flashcache on to the new flash cards

This is again an online operation and it can be run at any time:

[root@exa01cel01 ~]# cellcli -e alter flashcache all
Flash cache exa01cel01_FLASHCACHE altered successfully

 

4. Recreate the flashlog

The flashlog is always 512MB in size, but to make use of the new flash cards it has to be recreated. Use the DROP FLASHLOG command to drop the flash log, and then use the CREATE FLASHLOG command to create a flash log. The DROP FLASHLOG command can be run at runtime, but the command does not complete until all redo data on the flash disk is written to hard disk.

Here is an important note from Oracle:

If FORCE is not specified, then the DROP FLASHLOG command fails if there is any saved redo. If FORCE is specified, then all saved redo is purged, and Oracle Exadata Smart Flash Log is removed.

[root@exa01cel01 ~]# cellcli -e drop flashlog
Flash log exa01cel01_FLASHLOG successfully dropped
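Once the drop completes, recreate the flash log so it spans all flash disks – the standard cellcli command for that step (output omitted here) is:

[root@exa01cel01 ~]# cellcli -e create flashlog all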

 

5. Create grid disks

The best way to do that is to query the current grid disk sizes and use them to create the new grid disks. Use the following queries to obtain the size of each grid disk. We use disk 02 because the first two disks don’t have DBFS_DG grid disks on them.

[root@exa01db01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'DATA.*02.*\'"
exa01cel01: DATA01_CD_02_exa01cel01        2.8837890625T
[root@exa01cel01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'RECO.*02.*\'"
exa01cel01: RECO01_CD_02_exa01cel01        738.4375G
[root@exa01cel01 ~]# dcli -g cell_group -l root cellcli -e "list griddisk attributes name, size where name like \'DBFS_DG.*02.*\'"
exa01cel01: DBFS_DG_CD_02_exa01cel01       33.796875G

Then you can either generate the commands and run them on each cell or use dcli to create them on all three cells:

dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DATA_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=2.8837890625T"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk RECO_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=738.4375G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_06_\`hostname -s\` celldisk=CD_06_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_07_\`hostname -s\` celldisk=CD_07_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_08_\`hostname -s\` celldisk=CD_08_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_09_\`hostname -s\` celldisk=CD_09_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_10_\`hostname -s\` celldisk=CD_10_\`hostname -s\`,size=33.796875G"
dcli -g cell_group -l celladmin "cellcli -e create griddisk DBFS_DG_CD_11_\`hostname -s\` celldisk=CD_11_\`hostname -s\`,size=33.796875G"

6. The final step is to add newly created grid disks to ASM

Connect to the ASM instance using sqlplus as sysasm and disable the appliance mode:

SQL> ALTER DISKGROUP DATA01 set attribute 'appliance.mode'='FALSE';
SQL> ALTER DISKGROUP RECO01 set attribute 'appliance.mode'='FALSE';
SQL> ALTER DISKGROUP DBFS_DG set attribute 'appliance.mode'='FALSE';

Add the disks to the disk groups, you can either queue them on one instance or run them on both ASM instances in parallel:

SQL> ALTER DISKGROUP DATA01 ADD DISK 'o/*/DATA_CD_0[6-9]*','o/*/DATA_CD_1[0-1]*' REBALANCE POWER 128;
SQL> ALTER DISKGROUP RECO01 ADD DISK 'o/*/RECO_CD_0[6-9]*','o/*/RECO_CD_1[0-1]*' REBALANCE POWER 128;
SQL> ALTER DISKGROUP DBFS_DG ADD DISK 'o/*/DBFS_DG_CD_0[6-9]*','o/*/DBFS_DG_CD_1[0-1]*' REBALANCE POWER 128;

Monitor the rebalance using select * from gv$asm_operation and once done change the appliance mode back to TRUE:

SQL> ALTER DISKGROUP DATA01 set attribute 'appliance.mode'='TRUE';
SQL> ALTER DISKGROUP RECO01 set attribute 'appliance.mode'='TRUE';
SQL> ALTER DISKGROUP DBFS_DG set attribute 'appliance.mode'='TRUE';

And at this point you are done with the upgrade. I strongly recommend running the (latest) exachk report and making sure there are no issues with the configuration.

A problem you might encounter is that the flash is not fully utilized; in my case I had 128MB free on each card:

[root@exa01db01 ~]# dcli -g cell_group -l root "cellcli -e list celldisk attributes name,freespace where disktype='flashdisk'"
exa01cel01: FD_00_exa01cel01         128M
exa01cel01: FD_01_exa01cel01         128M
exa01cel01: FD_02_exa01cel01         128M
exa01cel01: FD_03_exa01cel01         128M
exa01cel02: FD_00_exa01cel02         128M
exa01cel02: FD_01_exa01cel02         128M
exa01cel02: FD_02_exa01cel02         128M
exa01cel02: FD_03_exa01cel02         128M
exa01cel03: FD_00_exa01cel03         128M
exa01cel03: FD_01_exa01cel03         128M
exa01cel03: FD_02_exa01cel03         128M
exa01cel03: FD_03_exa01cel03         128M

This seems to be a known bug and to fix it you need to recreate both flashcache and flashlog.
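A minimal sketch of the recreate sequence, run cell by cell and assuming the flash cache is in write-through mode (flush it first if write-back is enabled):

cellcli -e drop flashcache
cellcli -e drop flashlog
cellcli -e create flashlog all
cellcli -e create flashcache all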

References:
Extending an Eighth Rack to a Quarter Rack in Oracle Exadata Database Machine X4-2 and Later
Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)
Exachk fails due to incorrect flashcache size after upgrading from 1/8 to a 1/4 rack (Doc ID 2048491.1)

How to enable Exadata Write-Back Flash Cache


Yes, this is well known and the process has been described in Exadata Write-Back Flash Cache – FAQ (Doc ID 1500257.1), but what the note fails to make clear is that you do NOT have to restart the cell services anymore and hence resync the grid disks!

I had to enable WBFC many times before and every time I’d restart the cell services, as the note suggests. Well, this is not required anymore – starting with 11.2.3.3.1 it is no longer necessary to shut down the cellsrv service on the cells when changing the flash cache mode. This is not a big deal if you are deploying the Exadata just now, but it makes enabling/disabling WBFC for existing systems quicker and much easier.

The best way to do that is to use the script that Oracle has provided – setWBFC.sh. It will do all the work for you – pre-checks and changing the mode, either rolling or non-rolling.

Here are the checks it does for you:

  • Storage cells are valid storage nodes running 11.2.3.2.1 or later across all cells.
  • Grid disk status is ONLINE across all cells.
  • No ASM rebalance operations are running.
  • Flash cache state across all cells is “NORMAL”.

Enable Write-Back Flash Cache using a ROLLING method

Before you enable WBFC run a precheck to make sure the cells are ready and there are no faults.

./setWBFC.sh -g cell_group -m WriteBack -o rolling -p

At the end of the script, which takes less than two minutes to run, you’ll see a message indicating whether the storage cells passed the prechecks:

All pre-req checks completed:                    [PASSED]
2016-10-10 10:53:03
exa01cel01: flashcache size: 5.82122802734375T
exa01cel02: flashcache size: 5.82122802734375T
exa01cel03: flashcache size: 5.82122802734375T

There are 3 storage cells to process.

Then, once you are ready, run the script to enable WBFC:

./setWBFC.sh -g cell_group -m WriteBack -o rolling

The script will go through the following steps on each cell, one cell at a time:

1. Recheck griddisks status to make sure none are OFFLINE
2. Drop flashcache
3. Change WBFC flashcachemode to WriteBack
4. Re-create the flashcache
5. Verify flashcachemode is in the correct state

On a Quarter Rack it took around four minutes to enable WBFC and you’ll see this message at the end:

2016-10-10 11:23:24
Setting flash cache to WriteBack completed successfully.

Disable Write-Back Flash Cache using a ROLLING method

Disabling WBFC is not something you do every day, but sooner or later you might have to do it. I had to do it once for a customer who wanted to go back to WriteThrough because Oracle ACS said this was the default ?!

The steps to disable WBFC are the same as enabling it except that we need to flush all the dirty blocks off the flashcache before we drop it.

Again, run the precheck script to make sure everything looks good:

./setWBFC.sh -g cell_group -m WriteThrough -o rolling -p

If everything looks good then run the script:

./setWBFC.sh -g cell_group -m WriteThrough -o rolling

The script will first FLUSH flashcache across all cells in parallel and wait until the flush is complete!

You can monitor the flush process using the following commands:

dcli -l root -g cell_group cellcli -e "LIST CELLDISK ATTRIBUTES name, flushstatus, flusherror" | grep FD
dcli -l root -g cell_group cellcli -e "list metriccurrent attributes name,metricvalue where name like \'FC_BY_DIRTY.*\' "

The script will then go through the following steps on each cell, one cell at a time:

1. Recheck griddisks status to make sure none are OFFLINE
2. Drop flashcache
3. Change WBFC flashcachemode to WriteThrough
4. Re-create the flashcache
5. Verify flashcachemode is in the correct state

The time it takes to flush the cache depends on how many dirty blocks you’ve got in the flashcache and the machine workload. I did two eighth racks and unfortunately I didn’t check the number of dirty blocks, but it took 75mins on the first one and 4hrs on the second.

OTN Appreciation Day: Oracle Data Guard Fast-Start Failover


Thank you, Tim, for the great idea.

There are so many cool database features one could spend weeks blogging about them.

A feature which I like very much is Oracle DataGuard Fast-Start Failover, FSFO for short.

Oracle DataGuard Fast-Start Failover was one of the many new features introduced in Oracle Database 10.2. It’s an addition to the already available DataGuard option to maintain standby databases. DataGuard FSFO is a feature that automatically, quickly, and reliably fails over to a designated, synchronized standby database in the event of loss of the production database, without requiring manual intervention.

In an FSFO configuration there are three participants – the primary database, the standby database and an observer – and they follow a very simple rule: whichever two can communicate with each other will determine the outcome of fast-start failover. The observer usually runs on a third machine, requires only an Oracle client and will continuously monitor the primary and standby databases for possible failure conditions.

FSFO solves the problem we used to have with clusters before – a “split brain” scenario where after a failure of the connection between the cluster nodes we end up having two primary databases. FSFO also gives you the option to establish an acceptable time limit (in seconds) that the designated standby is allowed to fall behind the primary database (in terms of redo applied), beyond which time a fast-start failover will not be allowed.

Oracle DataGuard Fast-Start Failover can be used only in a broker configuration in either maximum availability mode or maximum performance mode.

I don’t have a post on FSFO (yet) but here are the links to the documentation:

Oracle Database 12.1 Data Guard Concepts and Administration

Oracle Database 12.1 Data Guard Broker

Oracle Database 12.1 Fast-Start Failover


Speaking at BGOUG and UKOUG


It’s my pleasure to be speaking at BGOUG and UKOUG again this year.

This coming Wednesday, 19th Oct, I’ll be speaking at the UKOUG Systems SIG event here in London (agenda). I’ll talk about the Exadata implementations I did last year and the issues I encountered, as well as things you need to keep in mind when you plan to extend the system or attach it to a ZFS Storage Appliance or Exalytics.

Next is my all-time favourite user group conference, BGOUG. It’s held in Pravetz, Bulgaria between 11-13 November. With an excellent lineup of speakers it is one not to miss (agenda). I’ll be speaking on Saturday at 10:00 about Protecting single instance databases with Oracle Clusterware 12c – in case you don’t have RAC, RAC One Node or 3rd party cluster licenses but you still need high availability for your database. I’ll go through cluster basics, the difference between single instance, RAC and RAC One Node, and then more technical details around the implementation of a single instance failover cluster.

Finally, it’s UKOUG Tech 16 with its massive 14 streams of sessions between 5-7 December and speakers from around the world (agenda). I’ll be speaking on Tuesday at 11:35 about Exadata extension use cases. I’ll talk about the Exadata extensions I did and what to keep in mind if you plan one – in particular extension of an eighth rack to a quarter rack, expansion of Exadata with more compute nodes or storage cells, and extension of an X3-8 two-rack configuration with another X4-8 rack.

I’d like to thank my company (Red Stack) for the support and BGOUG and UKOUG committees for accepting my sessions.

See you there!

 

Exadata memory configuration


Read this post if your Exadata compute nodes have 512/768GB of RAM or you plan to upgrade to the same.

There has been a lot written about hugepages so I won’t go into too much detail. For efficiency, the (x86) CPU allocates RAM in chunks (pages) of 4KB and those pages can be swapped to disk. For example, if your SGA allocates 32GB this will take 8388608 pages and, given that a Page Table Entry consumes 8 bytes, that’s 64MB of page table entries to look up. Hugepages, on the other hand, are 2MB. Pages that are used as huge pages are reserved inside the kernel and cannot be used for other purposes. Huge pages cannot be swapped out under memory pressure, there is obviously decreased page table overhead, and page lookups are not required since the pages are not subject to replacement. The bottom line is that you need to use them, especially with the amount of RAM we get nowadays.

For every new Exadata deployment I usually set the amount of hugepages to 60% of the physical RAM:
256GB RAM = 150 GB (75k pages)
512GB RAM = 300 GB (150k pages)
768GB RAM = 460 GB (230k pages)

This allows databases to allocate their SGA from the hugepages. If you want to allocate the exact number of hugepages that you need, Oracle has a script which will walk through all instances and give you the number of hugepages to set on the system – you can find the Doc ID in the references below.
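A minimal sketch of applying the setting on a 768GB node – the page count comes from the sizing above, adjust it to your own RAM and SGA requirements:

echo "vm.nr_hugepages = 230000" >> /etc/sysctl.conf
sysctl -p
grep -i hugepages /proc/meminfo    # verify HugePages_Total and HugePages_Free

Keep in mind that a large hugepages allocation may only be fully reserved after a reboot if memory is already fragmented.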

This also brings up an important point – to make sure your databases don’t allocate from both 4K and 2M pages, make sure the parameter use_large_pages is set to ONLY for all databases. Starting with 11.2.0.3 (I think) you’ll find hugepages information in the alert log when the database starts:

************************ Large Pages Information *******************
Per process system memlock (soft) limit = 681 GB

Total Shared Global Region in Large Pages = 2050 MB (100%)

Large Pages used by this instance: 1025 (2050 MB)
Large Pages unused system wide = 202863 (396 GB)
Large Pages configured system wide = 230000 (449 GB)
Large Page size = 2048 KB
********************************************************************

 

Now there is one more parameter you need to change if you deploy or upgrade an Exadata with 512/768GB of RAM. That is the total amount of shared memory, in pages, that the system can use at one time – kernel.shmall. On Exadata this parameter is set to the equivalent of 214GB by default, which is enough if your compute nodes have only 256GB of RAM. If the sum of all databases’ SGA memory is less than 214GB that’s OK, but the moment you try to start another database you’ll get the following error:

Linux-x86_64 Error: 28: No space left on device

For that reason, if you deploy or upgrade an Exadata with 512GB/768GB of physical RAM make sure you increase kernel.shmall too!

Some Oracle docs suggest this parameter should be set to half of the physical memory, others suggest it should be set to all available memory. Here’s how to calculate it:

kernel.shmall = physical RAM size / pagesize

To get the pagesize run getconf PAGE_SIZE at the command prompt. You need to set shmall to at least match the size of the hugepages – because that’s where we’d allocate SGA memory from. So if you run an Exadata with 768GB of RAM and have 460GB of hugepages, you’ll set shmall to 120586240 (460GB / 4K pagesize).

Using HUGEPAGES does not alter the calculation for configuring shmall!
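A minimal sketch of that calculation made persistent, assuming the 460GB hugepages allocation from above:

PAGESIZE=$(getconf PAGE_SIZE)
SHMALL=$(( 460 * 1024 * 1024 * 1024 / PAGESIZE ))    # 120586240 with a 4K pagesize
echo "kernel.shmall = $SHMALL" >> /etc/sysctl.conf
sysctl -p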

Reference:
HugePages on Linux: What It Is… and What It Is Not… (Doc ID 361323.1)
Upon startup of Linux database get ORA-27102: out of memory Linux-X86_64 Error: 28: No space left on device (Doc ID 301830.1)

Installing OEM13c management agent in silent mode


For many customers I work with, SSH won’t be available between the OEM host and the monitored hosts. Therefore you cannot push the management agent onto the hosts from the OEM console. The customer might have to raise a change request to allow SSH between the hosts, but this might take a while and it’s really unnecessary.

In that case the management agent has to be installed in silent mode. That is when the agent won’t be pushed from the OEM to the host but pulled by the host and installed. There are three ways to do that – using the AgentPull script, the agentDeploy script or an RPM file.

When using the AgentPull script, you download a script from the OMS and then run it. It will download the agent package and install it on the local host. Using the agentDeploy script is very similar, but to obtain it you use EMCLI. The third method of using an RPM file is similar – using EMCLI you download an RPM file and install it on the system. These methods require HTTP/HTTPS access to the OMS; agentDeploy and the RPM file also require EMCLI to be installed. For that reason I always use the AgentPull method since it’s quicker and really straightforward. Another benefit of the AgentPull method is that if you don’t have HTTP/HTTPS access to the OEM you can simply copy and paste the script.

Download the script from the OEM first, using curl or wget. The monitored hosts usually don’t have direct HTTP access, but many of them do have it through an HTTP proxy. Download the script and make it executable:

curl "https://oem13c.local.net:7802/em/install/getAgentImage" --insecure -o AgentPull.sh
chmod +x AgentPull.sh
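If the host only has outbound access through a proxy, the same download can be routed through it – the proxy host and port below are illustrative:

curl --proxy http://proxy.local.net:3128 "https://oem13c.local.net:7802/em/install/getAgentImage" --insecure -o AgentPull.sh
chmod +x AgentPull.sh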

If a proxy server is not available then you can simply copy and paste the script, the location of the script on the OEM server is:

/opt/app/oracle/em13c/middleware/oms/install/unix/scripts/AgentPull.sh.template

Make sure you edit the file and change the OMS host and OMS port parameters.

Check which platforms are available:

[oracle@exa01db01 ~]$ ./AgentPull.sh -showPlatforms

Platforms    Version
Linux x86-64    13.1.0.0.0

Run the script by supplying the sysman password, the platform you want to install, the agent registration password and the agent base directory:

[oracle@exa01db01 ~]$ ./AgentPull.sh LOGIN_USER=sysman LOGIN_PASSWORD=welcome1 \
PLATFORM="Linux x86-64" AGENT_REGISTRATION_PASSWORD=welcome1 \
AGENT_BASE_DIR=/u01/app/oracle/agent13c

It takes less than two minutes to install the agent and at the end you’ll see the following messages:

Agent Configuration completed successfully
The following configuration scripts need to be executed as the "root" user. Root script to run : /u01/app/oracle/agent13c/agent_13.1.0.0.0/root.sh
Waiting for agent targets to get promoted...
Successfully Promoted agent and its related targets to Management Agent

Now login as root and run the root.sh file.

To be honest, I use this method regardless of the circumstances, it’s so much easier and faster.

 

Unable to perform initial elastic configuration on Exadata X6


I had the pleasure to deploy another Exadata in the first week of 2017 and got my first issue this year.

As we know, starting with Exadata X5 Oracle introduced the concept of Elastic Configuration. Apart from allowing you to mix and match the number of compute nodes and storage cells, they have also changed how the IP addresses are assigned on the admin (eth0) interface. Prior to X5, Exadata had default IP addresses set at the factory in the range 192.168.1.1 to 192.168.1.203, but since this could collide with the customer’s network they changed the way those IPs are assigned. In short – the IP address on eth0 on the compute nodes and storage cells is assigned within the 172.16.2.1 to 172.16.7.254 range. The first time a node boots it will assign its hostname and IP address based on the IB ports it’s connected to.

Now to the real problem. I was doing the usual stuff – changing ILOMs, setting up the Cisco and IB switches – and was about to perform the initial elastic configuration (applyElasticConfig.sh), so I had to upload all the files I needed for the deployment to the first compute node. I changed my laptop address to an IP within the same range and was surprised to get connection timed out when I tried to ssh to the first compute node (172.16.2.44). I thought this was an unfortunate coincidence since I rebooted the IB switches almost at the same time I powered on the compute nodes, but I was wrong. For some reason, ALL servers had not got their eth0 IP addresses assigned, hence they were not accessible.

I was very puzzled about what was causing this issue and I spent the afternoon troubleshooting it. I thought Oracle had changed the way they assign the IP addresses, but the scripts hadn’t been changed for a long time. It didn’t take long before I found out what was causing it. Three lines in the /sbin/ifup script were the reason the eth0 interface wasn’t up with the 172.16.2.X IP address:

if ip link show ${DEVICE} | grep -q "UP"; then
    exit 0
fi

This check exits if the interface is already UP before proceeding further and bringing the interface up. The eth0 interface has, in fact, already been brought UP by the elastic configuration script in order to check whether there is a link on the interface. Then at the end of the script, when ifup is invoked to bring the interface up with its address, it stops the execution since the interface is already UP.

The solution is really simple – comment out the three lines (lines 73-75) in the /sbin/ifup script and reboot each node.
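A minimal sketch of doing that with sed, assuming the check sits exactly at lines 73-75 as it did in the affected initscripts build (verify the line numbers first):

cp /sbin/ifup /sbin/ifup.orig
sed -i '73,75 s/^/#/' /sbin/ifup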

This wasn’t the first X6 I’ve deployed and I had never had this problem before, so I did some further investigation. The /sbin/ifup script is part of the initscripts package. It turns out that the check for the interface being UP was introduced in one minor version of the package and then removed in the latest one. Unfortunately, the last entry in the changelog is from Apr 12 2016, so that’s not very helpful, but here’s a summary:

initscripts-9.03.53-1.0.1.el6.x86_64.rpm           11-May-2016 19:49     947.9 K  <– not affected
initscripts-9.03.53-1.0.1.el6_8.1.x86_64.rpm     12-Jul-2016 16:42     948.0 K    <– affected
initscripts-9.03.53-1.0.2.el6_8.1.x86_64.rpm     13-Jul-2016 08:26     948.1 K   <– affected
initscripts-9.03.53-1.0.3.el6_8.2.x86_64.rpm     23-Nov-2016 05:06     948.3 K <– latest version, not affected

I’ve had this problem on three Exadata machines so far. So, if you are doing a deployment of a new Exadata in the next few days or weeks it’s very likely that you will be affected, unless your Exadata was factory deployed after 23rd Nov 2016 – that’s the day the latest initscripts package was released.

onecommand fails to change storage cell name


It’s been a busy month – five Exadata deployments in the past three weeks and a new personal best – two Exadata X6-2 Eighth Racks with CoD and storage upgrades deployed in only 6hrs!

An issue I encountered with the first deployment was that onecommand wouldn’t change the storage cell names. The default cell names (not hostnames!) are based on where the cells are mounted within the rack and they are assigned by the elastic configuration script. The first cell name is ru02 (rack unit 02), the second cell is ru04, the third is ru06 and so on.

Now, if you are familiar with the cell and grid disks you would know that their names are based on the cell name. In other words, I got my cell, grid and ASM disks with the wrong names. Exachk would report the following failures for every grid disk:

Grid Disk name DATA01_CD_00_ru02 does not have cell name (exa01cel01) suffix
Naming convention not used. Cannot proceed further with
automating checks and repair for bug 12433293

Apart from exachk complaining, I wouldn’t feel comfortable with similar names on my Exadata.

Fortunately cell, grid and ASM disk names can be changed and here is how to do it:

Stop the cluster and CRS on each compute node:

/u01/app/12.1.0.2/grid/bin/crsctl stop cluster -all
/u01/app/12.1.0.2/grid/bin/crsctl stop crs

Log in to each storage server and rename the cell, the cell disks and the grid disks; use the following to build the alter commands:

You don’t need the cell services shut down, but the grid disks shouldn’t be in use, i.e. make sure to stop the cluster first!

cellcli -e alter cell name=exa01cel01
for i in `cellcli -e list celldisk | awk '{print $1}'`; do echo "cellcli -e alter celldisk $i name=$i"; done | sed -e "s/ru02/exa01cel01/2"
for i in `cellcli -e list griddisk | awk '{print $1}'`; do echo "cellcli -e alter griddisk $i name=$i"; done | sed -e "s/ru02/exa01cel01/2"

If you get the following error restart the cell services and try again:

GridDisk DATA01_CD_00_ru02 alter failed for reason: CELL-02548: Grid disk is in use.

Start the cluster on each compute node:

/u01/app/12.1.0.2/grid/bin/crsctl start crs

 

We’ve got all cell and grid disks fixed, now we need to rename the ASM disks. To rename an ASM disk you need to mount the diskgroup in restricted mode, i.e. mounted on one node only and with no one using it. If the diskgroup is not in restricted mode you’ll get:

ORA-31020: The operation is not allowed, Reason: disk group is NOT mounted in RESTRICTED state.

 

Stop the cluster on the second compute node, the default dbm01 database and the MGMTDB database:

srvctl stop database -d dbm01
srvctl stop mgmtdb

Mount diskgroups in restricted mode:

If you are running 12.1.2.3.0+ and a high redundancy DATA diskgroup, it is VERY likely that the voting disks are in the DATA diskgroup. Because of that, you wouldn’t be able to dismount the diskgroup. The only way I found around that was to force stop ASM and start it manually in restricted mode:

srvctl stop asm -n exa01db01 -f

sqlplus / as sysasm

startup mount restricted

alter diskgroup all dismount;
alter diskgroup data01 mount restricted;
alter diskgroup reco01 mount restricted;
alter diskgroup dbfs_dg mount restricted;

 

Rename the ASM disks; use the following to build the alter commands:

select 'alter diskgroup ' || g.name || ' rename disk ''' || d.name || ''' to ''' || REPLACE(d.name,'RU02','exa01cel01')  || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and d.name like '%RU02%';

select 'alter diskgroup ' || g.name || ' rename disk ''' || d.name || ''' to ''' || REPLACE(d.name,'RU04','exa01cel02') || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and d.name like '%RU04%';

select 'alter diskgroup ' || g.name || ' rename disk ''' || d.name || ''' to ''' || REPLACE(d.name,'RU06','exa01cel03') || ''';' from v$asm_disk d, v$asm_diskgroup g where d.group_number=g.group_number and d.name like '%RU06%';

 

Finally stop and start CRS on both nodes.

 

It was only when I thought everything was OK that I discovered one more reference to those pesky names – the failgroup names, which again are based on the storage cell name. The following will make it clearer:

select group_number,failgroup,mode_status,count(*) from v$asm_disk where group_number > 0 group by group_number,failgroup,mode_status;

GROUP_NUMBER FAILGROUP                      MODE_ST   COUNT(*)
------------ ------------------------------ ------- ----------
           1 RU02                           ONLINE          12
           1 RU04                           ONLINE          12
           1 RU06                           ONLINE          12
           1 EXA01DB01                      ONLINE           1
           1 EXA01DB02                      ONLINE           1
           2 RU02                           ONLINE          10
           2 RU04                           ONLINE          10
           2 RU06                           ONLINE          10
           3 RU02                           ONLINE          12
           3 RU04                           ONLINE          12
           3 RU06                           ONLINE          12

For each diskgroup we've got three fail groups (one per storage cell). The other two fail groups, EXA01DB01 and EXA01DB02, are the quorum disks.

Unfortunately, you cannot rename failgroups in ASM. My immediate thought was to drop each failgroup and add it back, hoping that would resolve the problem. However, since this was a quarter rack I couldn't do it; here's an excerpt from the documentation:

If a disk group is configured as high redundancy, then you can do this procedure on a Half Rack or greater. You will not be able to do this procedure on a Quarter Rack or smaller with high redundancy disk groups because ASM will not allow you to drop a failure group such that only one copy of the data remains (you’ll get an ORA-15067 error).

The last option was to recreate the diskgroups. I've done this many times before, when the compatible.rdbms parameter was set too high and I had to install some earlier version of 11.2. However, since Oracle decided to move the voting disks to DATA, this became a bit harder. I couldn't drop DBFS_DG because that's where the MGMTDB was created, and I couldn't drop DATA01 either because of the voting disks and some parameter files. I could have recreated the RECO01 diskgroup but decided to keep it "consistently wrong" across all three diskgroups.

Fortunately, this behaviour might change with the January 2017 release of OEDA. The following bug fix suggests that DBFS_DG will always be configured as high redundancy and will host the voting disks:

24329542: oeda should make dbfs_dg as high redundancy and locate ocr/vote into dbfs_dg

There is also a feature request to support failgroup rename but it’s not very popular, to be honest. Until we get this feature, exachk will report the following failure:

failgroup name (RU02) for grid disk DATA01_CD_00_exa01cel01 is not cell name
Naming convention not used. Cannot proceed further with
automating checks and repair for bug 12433293

I’ve deployed five Exadata X6-2 machines so far and had this issue on all of them.

This issue seems to be caused by a bug in OEDA. The storage cell names should have been changed as part of the "Create Cell Disks" step of onecommand. I keep the logs from some older deployments, where it's very clear that each cell was renamed as part of this step:

Initializing cells...

EXEC## |cellcli -e alter cell name = exa01cel01|exa01cel01.local.net|root|

I couldn't find that command in the logs of the deployments I did. Obviously, the solution for now is to manually rename the cells before you run the "Create Cell Disks" step of onecommand.
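
For reference, the manual workaround boils down to something like this, run before the "Create Cell Disks" step (the host names and the cell_group file are examples from this deployment):

ssh root@exa01cel01 "cellcli -e alter cell name=exa01cel01"
ssh root@exa01cel02 "cellcli -e alter cell name=exa01cel02"
ssh root@exa01cel03 "cellcli -e alter cell name=exa01cel03"
dcli -g ~/cell_group -l root "cellcli -e list cell attributes name"   # verify the new names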

 

Upgrading to Exadata 12.1.2.2.0 or later – mind the 32bit packages


This might not be relevant anymore; shame on me for keeping it as a draft for a few months. However, there are still people running older versions of the Exadata storage software, so it might still help someone out there.

With the release of Exadata storage software 12.1.2.2.0, Oracle announced that some 32bit (i686) packages will be removed from the OS as part of the upgrade.

This happened to me in the summer last year (blog post) and I thought back then that someone messed up the dependencies. After seeing it again for another customer a month later I thought it might be something else. So after checking the release notes for all the recent patches, I found this for the 12.1.2.2.0 release:

Note that several i686 packages may be removed from the database node when being updated. Run dbnodeupdate.sh with -N flag at prereq check time to see exactly what rpms will be removed from your system.

Now, this will all be OK if you haven't installed any additional packages. If, however, like many other customers you have packages such as LDAP or Kerberos clients installed, then your dbnodeupdate pre-check will fail with "Minimum dependency check failed" and broken dependencies, since all i686 packages will be removed as part of the update.

The way around that is to run dbnodeupdate with the -N flag and check the logs to see which packages will be removed and what will be impacted. Then remove any packages you installed yourself. After the Exadata storage software update, you'd need to install the relevant versions of those packages again.
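
A simple way to keep track of what you'll have to put back, assuming yum still has access to the relevant channels afterwards (the package names below are just examples):

# Record the i686 packages installed before the update
rpm -qa --queryformat '%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n' | grep '\.i686$' > /tmp/i686_before_update.txt

# Remove the ones you installed yourself before running dbnodeupdate, e.g.:
# yum remove openldap-clients.i686 krb5-workstation.i686

# After the storage software update, reinstall the ones you still need:
# yum install openldap-clients.i686 krb5-workstation.i686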

Having said that, I need to mention the note below on what software you are allowed to install on Exadata:

Is it acceptable / supported to install additional or 3rd party software on Exadata machines and how to check for conflicts? (Doc ID 1541428.1)

 

Oracle Database 12.2 released for Exadata on-premises


We live in exciting times, Oracle Database 12.2 for Exadata was released earlier today.

The 12.2 database has already been available on the Exadata Express Cloud Service and Database as a Service for a few months now.

Today, it has been released for Exadata on-premises, five days earlier than the initial Oracle announcement of 15th Feb.

The documentation suggests that to run a 12.2 database you need at least Exadata storage software 12.1.2.2.1, but you'd better go with the recommended version, 12.2.1.1.0, which was released just a few days ago. Here are a few notes on running a 12.2 database on Exadata:

Recommended Exadata storage software: 12.2.1.1.0 or higher
Supported Exadata storage software: 12.1.2.2.1 or higher

Full Exadata offload functionality for Database 12.2, and IORM support for Database 12.2 container databases and pluggable databases requires Exadata 12.2.1.1.0 or higher.

Exadata Storage Server version 12.2.1.1.0 will be required for full Exadata functionality including 'Smart Scan offloaded filtering', 'storage indexes', …

Current Oracle Database and Grid Infrastructure version must be 11.2.0.3, 11.2.0.4, 12.1.0.1 or 12.1.0.2.  Upgrades from 11.2.0.1 or 11.2.0.2 directly to 12.2.0.1 are not supported.

There is a completely new note on how to upgrade to 12.2 GI and RDBMS on Exadata:

12.2 Grid Infrastructure and Database Upgrade steps for Exadata Database Machine running 11.2.0.3 and later on Oracle Linux (Doc ID 2111010.1)

The 12.2 GI and RDBMS binaries are available from MOS as well as edelivery:

Patch 25528839: Grid Software clone version 12.2.0.1.0
Patch 25528830: Database Software clone version 12.2.0.1.0

The recommended Exadata storage software for running 12.2 RDBMS:

Exadata 12.2.1.1.0 release and patch (21052028) (Doc ID 2207148.1)

For more details about the 12.2.1.1.0 Exadata storage software refer to this slide deck, and of course here is a link to the 12.2 documentation, as we all need to start getting familiar with it.

Also, last week Oracle released the February version of OEDA to support the new Exadata SL6 hardware. It does not support 12.2 database yet and I guess we’ll see another release in February to support the 12.2 GI and RDBMS.

Happy upgrading! 🙂


How to install Oracle Database Cloud Backup Module


I had the joy of using the Oracle Database Cloud backup module a month ago. Needless to say, things have changed since last year and some of the blog posts I found were not relevant anymore.

I used the database cloud backup module to take a backup of a local database and restore it in the cloud.

A good start would be this link to the documentation.

The installation is really simple: download the installer from here, unzip the file and install the module. I've cut most of the output to keep it simple and focus on the actual steps.

Now here's the fun bit: it took me almost an hour to install it, and that's because of the values I had to pass in the host parameter:

java -jar opc_install.jar -host https://em2.storage.oraclecloud.com/v1/Storage-a468785 -opcId 'svetoslav.gyurov@redstk.com' -opcPass 'welcome1' -walletDir $ORACLE_HOME/opc -libDir $ORACLE_HOME/lib

where em2 is the EMEA Commercial 2 data centre, a468785 is your identity domain, and then you supply your credentials as well.
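
In other words, the host parameter follows this pattern (the data centre code and identity domain below are placeholders):

https://<data_centre>.storage.oraclecloud.com/v1/Storage-<identity_domain>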

That’s it, once installed you can now run backups of your database to the cloud:

 RMAN> set encryption on identified by welcome1 only;
RMAN> run
{
allocate CHANNEL c1 DEVICE TYPE sbt PARMS='SBT_LIBRARY=libopc.so, SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/11.2.0.4/db_1/dbs/opcorcl.ora)';
backup database;
}

Keep in mind that the cloud backup module requires encryption, hence you need to at least set a password on the backup or you'll get the following error:

 ORA-27030: skgfwrt: sbtwrite2 returned error
ORA-19511: non RMAN, but media manager or vendor specific failure, error text:
KBHS-01602: backup piece 0qkqgtua_1_1 is not encrypted 
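
If you don't want to repeat the channel allocation in every run block, the sbt channel can also be configured as a persistent RMAN setting; a minimal sketch (the password-based SET ENCRYPTION still has to be issued in each RMAN session unless you use a TDE wallet):

RMAN> CONFIGURE CHANNEL DEVICE TYPE sbt PARMS 'SBT_LIBRARY=libopc.so, SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/11.2.0.4/db_1/dbs/opcorcl.ora)';
RMAN> CONFIGURE DEFAULT DEVICE TYPE TO sbt;
RMAN> set encryption on identified by welcome1 only;
RMAN> backup database;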

 

Now, on the other host, start a dummy instance and restore the spfile first:

RMAN> set decryption identified by welcome1;
RMAN> set dbid=1000458402;
RMAN> run {
allocate CHANNEL c1 DEVICE TYPE sbt PARMS='SBT_LIBRARY=libopc.so, SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/11.2.0.4/db_1/dbs/opcorcl.ora)';
restore spfile to pfile '/u01/app/oracle/product/11.2.0.4/db_1/dbs/initorcl.ora' from autobackup;
}

Then restart the instance in nomount state with the new pfile, restore the controlfile, mount, and restore the database, as simple as that:

RMAN> set decryption identified by welcome1;
RMAN> set dbid=1000458402;
RMAN> run {
allocate CHANNEL c1 DEVICE TYPE sbt PARMS='SBT_LIBRARY=libopc.so, SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/11.2.0.4/db_1/dbs/opcorcl.ora)';
restore controlfile from autobackup;
alter database mount;
restore database;
}
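
To actually open the restored copy you'd normally also recover it and open with RESETLOGS; roughly, assuming the needed archived logs were part of the backup:

RMAN> run {
allocate CHANNEL c1 DEVICE TYPE sbt PARMS='SBT_LIBRARY=libopc.so, SBT_PARMS=(OPC_PFILE=/u01/app/oracle/product/11.2.0.4/db_1/dbs/opcorcl.ora)';
recover database;
alter database open resetlogs;
}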

If you've got a relatively small database estate, that's a good way to back up your databases to an offsite location. In my case it was a bit slow; it took about 30 minutes to back up a 4GB database, but the restore within the cloud was quicker.
