Unix Semaphores and Shared Memory Explained
===========================================

General
=======

Shared memory is exactly that - a memory region that can be shared between
different processes. Oracle uses shared memory for implementing the
SGA, which needs to be visible to all database sessions. Shared memory
is also used in the implementation of the SQL*Net V1 Fast driver as a
means of communicating between the application and shadow process. On
the RS/6000, each shadow process stores its PGA in a shared memory
segment (however, only the shadow attaches this segment). In the
latter two cases, Oracle allocates the shared memory dynamically as
opposed to the allocation of the SGA, which occurs at instance startup.
This allocation will not be discussed in this paper.

Semaphores can be thought of as flags (hence their name, semaphores).
They are either on or off. A process can turn the flag on or turn it off.
If the flag is already on, processes that try to turn it on will
sleep until the flag is off. Upon awakening, the process will
reattempt to turn the flag on, possibly succeeding or possibly sleeping
again. Such behaviour allows semaphores to be used in implementing a
post-wait driver - a system where processes can wait for events (i.e.
wait on turning on a semaphore) and post events (i.e. turn off a
semaphore). This mechanism is used by Oracle to maintain concurrency
control over the SGA, since it is writeable by all processes attached.
Also, for the same reasons, use of the Fast Driver requires additional
semaphores. However, these semaphores will be allocated dynamically
instead of at instance startup. This allocation will not be discussed in
this paper.
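
The post-wait idea can be illustrated with a small sketch. It uses Python's threading.Semaphore in place of a System V semaphore, which is an assumption made purely for illustration; it is not how Oracle implements its post-wait driver. The names flag, waiter and events are hypothetical.

```python
import threading

# A semaphore with value 0 models a flag that is already "on":
# anyone who tries to turn it on (acquire) will sleep.
flag = threading.Semaphore(0)
events = []

def waiter():
    # wait: sleep until some other thread posts
    flag.acquire()
    events.append("waiter woke")

worker = threading.Thread(target=waiter)
worker.start()

events.append("posting")
flag.release()          # post: turn the flag off and wake one sleeper
worker.join()
print(events)
```

The waiter cannot record its entry until the post has happened, which is the ordering guarantee a post-wait driver relies on.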

Instance startup
================

On instance startup, the first things that the instance does are:

- Read the "init.ora"

- Start the background processes

- Allocate the shared memory and semaphores required

The size of the SGA will be calculated from various "init.ora" parameters.
This will be the amount of shared memory required. The SGA is broken into 4
sections - the fixed portion, which is constant in size, the variable portion,
which varies in size depending on "init.ora" parameters, the redo block
buffer, which has its size controlled by log_buffers, and the db
block buffer, which has its size controlled by db_block_buffers.

The size of the SGA is the sum of the sizes of the 4 portions.
There is unfortunately no simple formula for determining the size
of the variable portion. Generally, the shared pool dominates all
other parts of the variable portion, so as a rule of thumb, one can
estimate the size as the value of shared_pool_size (in v6, one can
ignore the size of the variable portion).
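
As a rough illustration of the rule of thumb above, the four portions can simply be added up. This is a sketch only: the function name, the sample numbers, and the assumption that the block buffer occupies db_block_buffers times the block size are all illustrative and are not taken from Oracle.

```python
def estimate_sga_bytes(fixed_bytes, shared_pool_size, redo_buffer_bytes,
                       db_block_buffers, db_block_size):
    """Sum the four SGA portions, using the shared pool as the variable part."""
    variable_bytes = shared_pool_size                 # rule of thumb from the text
    block_buffer_bytes = db_block_buffers * db_block_size   # assumed: count * size
    return (fixed_bytes + variable_bytes
            + redo_buffer_bytes + block_buffer_bytes)

# Hypothetical values, chosen only to show the arithmetic.
example = estimate_sga_bytes(
    fixed_bytes=40_000,
    shared_pool_size=3_500_000,
    redo_buffer_bytes=8_192,
    db_block_buffers=200,
    db_block_size=2_048,
)
print(example)
```

The result is only an estimate, since the variable portion holds more than the shared pool.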

The number of semaphores required is much simpler to determine. Oracle will
need exactly as many semaphores as the value of the processes "init.ora"
parameter.

Note that the recommended kernel parameter values in the ICG are enough
to support the default database (4M SGA, 50 processes), but may be
insufficient to run a larger instance. With the above estimations and the
information which follows, a DBA should be able to build a kernel with
appropriate settings to support the instance.


Shared memory allocation
========================

Oracle has 3 different possible models for the SGA - one-segment,
contiguous multi-segment, and non-contiguous multi-segment.
When attempting to allocate and attach shared memory for the SGA, it
will attempt each one, in the above order, until one succeeds or raises
an ORA error. On other, non-fatal, errors, Oracle simply cleans up and
tries again using the next memory model. The entire SGA must fit into
shared memory, so the total amount of shared memory allocated under any
model will be equal to the size of the SGA. This calculated value will
be referred to below as SGASIZE.

The one-segment model is the simplest and first model tried. In this
model, the SGA resides in only one shared memory segment. Oracle attempts
to allocate and attach one shared memory segment of size equal to the total
size of the SGA. However, if SGASIZE is larger than the configured
SHMMAX, this will obviously fail (with EINVAL). In this case, the SGA will
need to be placed in multiple shared memory segments, and Oracle proceeds
to the next memory model for the SGA. If an error other than EINVAL occurs
when allocating the shared memory with shmget(), Oracle will raise an
ORA-7306. If the segment was received (i.e. if SHMMAX > SGASIZE), Oracle
attempts to attach it at the start address defined in ksms.o. An error
on the attach will raise an ORA-7307.
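
The decision flow for the one-segment model can be sketched as follows. The shmget and shmat arguments are stand-in callables supplied by the caller, not the real system calls, and the returned strings are labels chosen for this example; only the error numbers come from the text above.

```python
import errno

def try_one_segment(sga_size, shmget, shmat):
    """Return 'attached', 'next-model', or the ORA error named in the text."""
    try:
        segment_id = shmget(sga_size)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            return "next-model"      # SGASIZE exceeds SHMMAX: go multi-segment
        return "ORA-7306"            # any other shmget() failure
    try:
        shmat(segment_id)
    except OSError:
        return "ORA-7307"            # segment obtained but attach failed
    return "attached"

def make_fake_shmget(shmmax):
    """Hypothetical stand-in that only enforces a SHMMAX limit."""
    def fake_shmget(size):
        if size > shmmax:
            raise OSError(errno.EINVAL, "Invalid argument")
        return 1
    return fake_shmget

def fake_shmat_ok(segment_id):
    """Hypothetical stand-in for an attach that always succeeds."""
    return None

print(try_one_segment(4 << 20, make_fake_shmget(8 << 20), fake_shmat_ok))
print(try_one_segment(4 << 20, make_fake_shmget(1 << 20), fake_shmat_ok))
```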

With multiple segments there are two possibilities. The segments
can be attached contiguously, so that it appears to be one large
shared memory segment, or non-contiguously, with gaps between the
segments. The former wastes less space that could be used for the stack
or heap, but depending on alignment requirements for shared memory
(defined by SHMLBA in the kernel), it may not be possible.

At this point, Oracle needs to determine SHMMAX so it can determine how many
segments will be required. This is done via a binary search
algorithm over the range [1...SGASIZE] (since Oracle is trying this
model and not the one-segment model, it must be that SHMMAX < SGASIZE).

[...]

In version 6, the "sgadef.dbf" file is used to
get the necessary information. In version 7, the SGA itself contains
the information about the shared memory and semaphores (how the
bootstrap works will be explained later). In either case, the
information stored is the same - the key, id, size, and attach
address of each shared memory segment and the key, id, and size of
each semaphore set. Note that we need not do anything special to
initialize the semaphores. We can use them with the data structure
we read in on connecting.

The version 6 approach is rather simple. It first tries to open the
"sgadef.dbf" file. If it cannot, an ORA-7318 is raised. Once
opened, the data written earlier on startup is read. If an error
occurs for some reason on the read, an ORA-7319 occurs. Once all the
data is read in, Oracle attaches each segment in turn.

First, it generates what it believes the key for the segment should be. It
then gets that segment, returning ORA-7429 if it fails. The key used
and the key stored are then compared. They should be equal, but if
not, an ORA-7430 occurs. Once the key is verified, the segment is
attached. A failure to attach the segment raises an ORA-7320. If
the segment is attached, but not at the address we requested, an
ORA-7321 occurs. This process is repeated for all segments until the
entire SGA is attached.
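
The per-segment checks just described can be summarised in a short sketch. The get_segment and attach callables and the layout of the stored record are hypothetical; only the order of the checks and the error numbers are taken from the text.

```python
def attach_segment(expected_key, stored, get_segment, attach):
    """Run the checks for one segment; return None on success or the ORA error."""
    segment_id = get_segment(expected_key)
    if segment_id is None:
        return "ORA-7429"        # the segment could not be obtained
    if expected_key != stored["key"]:
        return "ORA-7430"        # generated key differs from the stored key
    address = attach(segment_id, stored["address"])
    if address is None:
        return "ORA-7320"        # attach failed
    if address != stored["address"]:
        return "ORA-7321"        # attached, but not where requested
    return None

# Hypothetical stored record for one segment.
stored = {"key": 0x1234, "address": 0x80000000}
print(attach_segment(0x1234, stored, lambda key: 7, lambda seg, addr: addr))
```

In a real connect this sequence would be repeated for every segment recorded for the SGA.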

Version 7 differs only in the first part, when the shared memory and
semaphore data is read. Once that data is read in, Oracle proceeds in
the same manner. To fetch this data, Oracle generates what it thinks
should be the key for the first segment of the SGA and attaches it
as if it were the only segment. Once it is attached, the data is
copied from the SGA. With this data, Oracle attaches any remaining
segments for the SGA.

There is one possible problem. If somehow two instances have a key
collision (i.e. they both generate the same key for their first segment), it
is possible to only have one of the two instances up at a time! Connection
attempts to either one will connect a user to whichever instance is up.
This is rare, but can happen. Development is currently working on a better
key generation algorithm.

Note: See Note 399261.1 for information regarding 10g and newer releases,
which have a new feature known as NUMA optimization.


Attaching shared memory
=======================

As seen in previous sections, shared memory must be received (this may
mean allocating the shared memory, but not necessarily) and then
attached, to be used. Attaching shared memory brings the shared
memory into the process' memory space. There are some important
things about attach addresses. For one thing, they may need to be
aligned on some boundary (generally defined by SHMLBA). More
importantly, shared memory must be mapped to pages in the process'
memory space which are unaccounted for. Every process already has a
text, a data, and a stack segment laid out as follows (in general):

+---------+ high addresses
|  stack  |
|---------| -+
|    |    |  |
|    v    |  |
|---------|  |
| shm seg |  |- unused portion
|---------|  |  These are valid pages for shared memory
|    ^    |  |  Pages are allocated from this area
|    |    |  |  as both the stack and heap(data) grow
|---------| -+
|  data   |
|---------|
|  text   |
+---------+ low addresses

So, valid attach addresses lie in the unused region between the stack
and the data segments (a shared memory segment is drawn in the
diagram to aid in visualization - not every process has shared memory
attached!). Of course, the validity also depends on the
size of the segment, since it cannot overlap another segment. Note
that both the stack and data segments can grow during the life of a
process. Because segments must be contiguous and overlapping is not
allowed, this is of some importance.
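
The conditions on an attach address can be written down as a small check. This is a simplified model, assuming a single unused region bounded by the current end of the data segment and the current bottom of the stack; the function and parameter names are hypothetical.

```python
def is_valid_attach(address, size, shmlba, data_end, stack_bottom, attached=()):
    """True if [address, address + size) is aligned and lies in unused space."""
    if address % shmlba != 0:
        return False                     # not on an SHMLBA boundary
    end = address + size
    if address < data_end or end > stack_bottom:
        return False                     # would overlap data or stack
    # Must not overlap any segment already attached, given as (start, size).
    return all(end <= start or address >= start + length
               for start, length in attached)

# Hypothetical layout, for illustration only.
print(is_valid_attach(0x40000000, 0x100000, 0x1000, 0x10000000, 0x7F000000))
```

The check is only a snapshot: because the data and stack segments can grow later, an address that is valid now may still cause trouble during the life of the process.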

Attaching shared memory creates a limit on how much the stack or data segment
can grow. Limiting the stack is typically not a problem, except when running
deeply recursive code. Neither is limiting the data segment, but this does
restrict the amount of memory that can be dynamically allocated by a
program. It is possible (but rare) that some applications
running against the database may hit this limit in the shadow (since
the shadow has the SGA attached). This is the cause of ORA-7324 and
ORA-7325 errors. How to deal with these is discussed in the
troubleshooting section.

The SGA is attached, depending on the allocation model used, more or
less contiguously (there may be gaps, but those can be treated as if
they were part of the shared memory). So where the beginning of the
SGA can be attached depends on the SGA's size. The default address
which is chosen by Oracle is generally sufficient for most SGAs.
However, it may be necessary to relocate the SGA for very large
sizes. It may also need to be changed if ORA-7324 or ORA-7325 errors
are occurring. The beginning attach address is defined in the file
"ksms.s". Changing the attach address requires recompilation of the
Oracle kernel and should not be done without first consulting Oracle
personnel. Unfortunately, there is no good way to determine what a good
attach address will be.

When changing the address to allow a larger SGA, a good rule of thumb is
taking the default attach address in "ksms.s" and subtracting the size of
the SGA. The validity of an attach address can be tested with the Oracle
provided tstshm executable. Using:

tstshm -t -b

will determine if the address is usable or not.
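
The rule of thumb for relocating the SGA amounts to one subtraction. The sketch below also rounds the result down to a multiple of SHMLBA, which is an added assumption based on the alignment requirement mentioned earlier; the function name and the sample values are hypothetical, and any address it produces would still need to be tested with tstshm.

```python
def candidate_attach_address(default_address, sga_size, shmlba):
    """Default attach address minus the SGA size, rounded down to SHMLBA."""
    lowered = default_address - sga_size
    if lowered < 0:
        raise ValueError("SGA is larger than the default attach address")
    return lowered - (lowered % shmlba)

# Hypothetical values: 2 GB default address, SGA just over 100 MB, 4 MB SHMLBA.
candidate = candidate_attach_address(0x80000000, 0x6401000, 0x400000)
print(hex(candidate))
```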


Troubleshooting
===============

Errors which might have multiple causes are discussed in this
section. Errors not mentioned here generally have only one cause
which has a typically obvious solution.

ORA-7306, ORA-7336, ORA-7329
Oracle received a system error on a shmget() call. The system error
should be reported. There are a few possibilities:

1) There is insufficient shared memory available. This is
indicated by the operating system error ENOSPC. Most likely, SHMMNI
is too small. Alternatively, there may be shared memory already
allocated; if it is not attached, perhaps it can be freed. Maybe
shared memory isn't configured in the kernel.

2) There is insufficient memory available. Remember, shared memory
needs pages of virtual memory. The system error ENOMEM indicates there
is insufficient virtual memory. Swap needs to be increased, either by
adding more or by freeing currently used swap (i.e. free other shared
memory, kill other processes)

3) The size of the shared memory segment requested is invalid. In this
case, EINVAL is returned by the system. This should be very rare - however,
it is possible. This can occur if SHMMAX is not a multiple of page
size and Oracle is trying a multi-segment model. Remember that Oracle
rounds its calculation of SHMMAX to a page boundary, so it may have
rounded it up past the real SHMMAX! (Whether this is a bug is
debatable).

4) The shared memory segment does not exist. This would be indicated
by the system error ENOENT. This would never happen on startup; it
only would happen on connects. The shared memory most likely has been
removed unexpectedly by someone or the instance is down.
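
Cause 3 is easier to see with numbers. The sketch below finds the largest allocatable size by binary search over [1...SGASIZE] and then rounds it up to a page boundary. The probe function, the limit and the page size are hypothetical, and rounding up (rather than to the nearest page) is an assumption; the text only says that the value is rounded to a page boundary.

```python
def find_max_segment(sga_size, can_allocate):
    """Binary search [1, sga_size] for the largest size can_allocate accepts."""
    low, high, best = 1, sga_size, 0
    while low <= high:
        mid = (low + high) // 2
        if can_allocate(mid):
            best, low = mid, mid + 1     # mid works, look for something larger
        else:
            high = mid - 1               # mid is too big
    return best

def round_up(value, page_size):
    """Round value up to the next multiple of page_size."""
    return -(-value // page_size) * page_size

REAL_SHMMAX = 1_000_000    # hypothetical limit that is not a page multiple
PAGE_SIZE = 4096           # hypothetical page size

found = find_max_segment(8_000_000, lambda size: size <= REAL_SHMMAX)
rounded = round_up(found, PAGE_SIZE)
print(found, rounded, rounded > REAL_SHMMAX)
```

With these values the search finds the true limit, but the rounded figure is larger than the limit, so a request of that size would be rejected with EINVAL.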

ORA-7307, ORA-7337, ORA-7320
Oracle received a system error on a shmat() call. The system error should be
reported. There are a few possibilities:

1) The attach address is bad. If this is the cause, EINVAL is returned
by the system. Refer to the section on the attach address to see why
the attach address might be bad. This may happen after enlarging the
SGA.

2) The permissions on the segment do not allow the process to attach
it. The operating system error will be EACCES. Generally the cause of
this is either the setuid bit is not turned on for the oracle
executable, or root started the database (and happens to own the shared
memory). Normally, this would be seen only on connects.

3) The process cannot attach any more shared memory segments. This
would be accompanied by the system error EMFILE. SHMSEG is too
small. Note that as long as SHMSEG is greater than SS_SEG_MAX, you
should never see this happen.

ORA-7329, ORA-7334
Oracle has determined the SGA needs too many shared memory segments. Since you
can't change the limit on the number of segments, you should instead increase
SHMMAX so that fewer segments are required.

ORA-7339
Oracle has determined it needs too many semaphore sets. Since you
can't change the limit on the number of semaphore sets, you should
increase SEMMSL so fewer sets are required.
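
Both of the limits above come down to a ceiling division. The following sketch assumes that segments and semaphore sets are filled completely before another one is started, which may not match what Oracle actually does; the limits of 20 and 10 are the "normal" values quoted in the Oracle parameters section, and the function names are hypothetical.

```python
SS_SEG_MAX = 20    # usual compile-time limit on SGA segments
SS_SEM_MAX = 10    # usual compile-time limit on semaphore sets

def units_needed(total, per_unit_limit):
    """Ceiling division: how many segments or sets are needed for total."""
    return -(-total // per_unit_limit)

def fits(sga_size, shmmax, processes, semmsl):
    """Return (segments_ok, semaphore_sets_ok) for the given kernel settings."""
    segments = units_needed(sga_size, shmmax)
    semaphore_sets = units_needed(processes, semmsl)
    return segments <= SS_SEG_MAX, semaphore_sets <= SS_SEM_MAX

# Hypothetical settings: 64 MB SGA and 300 processes.
print(fits(64 << 20, 2 << 20, 300, 25))   # too many segments and too many sets
print(fits(64 << 20, 4 << 20, 300, 32))   # both within the limits
```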

ORA-7250, ORA-7279, ORA-7252, ORA-27146
Oracle received a system error on a semget() call. The system error should be
reported. There should be only one system error ever returned with
this, ENOSPC. This can mean one of two things. Either the system
limit on semaphore sets has been reached or the system limit on the
total number of semaphores has been reached. Raise SEMMNI or SEMMNS,
as is appropriate, or perhaps there are some semaphore sets which can
be released. In the case of ORA-7250, ORANSEMS may be set too high
(>SEMMSL). If it is, raise SEMMSL or decrease ORANSEMS.

ORA-7251
Oracle failed to allocate even a semaphore set of only one semaphore. It is
likely that semaphores are not configured in the kernel.

ORA-7318
Oracle could not open the sgadef file. The system error number will be
returned. There are a few possible causes:

1) The file doesn't exist. In this case, the system error ENOENT is
returned. Maybe ORACLE_SID or ORACLE_HOME is set wrong so that Oracle
is looking in the wrong place. Possibly the file does not exist (in this
case, a restart is necessary to allow connections again).

2) The file can't be accessed for reading. The operating system error returned
with this is EACCES. The permissions on the file (or maybe
directories) don't allow an open for reading of the sgadef file. It
might not be owned by the oracle owner. The setuid bit might not be
turned on for the oracle executable.

ORA-7319
Oracle did not find all the data it expected when reading the
sgadef.dbf file. Most likely the file has been truncated. The
only recovery is to restart the instance.

ORA-7430
Oracle expected a key to be used for the segment which does not match the
key stored in the shared memory and semaphore data structure. This probably
indicates a corruption of the sgadef file (in version 6) or
the data in the first segment of the SGA (in version 7). A restart of
the instance is probably necessary to recover in that case. It may
also be a key collision problem and Oracle is attached to the wrong
instance.

ORA-7321
Oracle was able to attach the segment, but not at the address it
requested. In most cases, this would be caused by corrupted data in
the sgadef file (in version 6) or the first segment of the SGA (in
version 7). A restart of the database may be necessary to recover.

ORA-7324, ORA-7325
Oracle was unable to allocate memory. Most likely, the heap (data
segment) has grown into the bottom of the SGA. Relocating the SGA to a
higher attach address may help, but there may be other causes. Memory
leaks can cause this error. The init.ora parameter sort_area_size may be
too large; decreasing it may resolve the error. The init.ora parameter
context_incr may also be too large; decreasing it may resolve this error.

ORA-7264, ORA-7265
Oracle was unable to decrement/increment a semaphore. This generally
is accompanied by the system error EINVAL and a number which is the
identifier of the semaphore set. This is almost always because the
semaphore set was removed, but the shadow process was not aware of it
(generally due to a shutdown abort or instance crash). This error
is usually ignorable.

System Parameters
=================

SHMMAX - kernel parameter controlling maximum size of one shared memory
segment
SHMMNI - kernel parameter controlling maximum number of shared memory segments
in the system
SHMSEG - kernel parameter controlling maximum number of shared memory segments
a process can attach
SEMMNS - kernel parameter controlling maximum number of semaphores in
the system
SEMMNI - kernel parameter controlling maximum number of semaphore
sets. Semaphores in Unix are allocated in sets of 1 to SEMMSL.
SEMMSL - kernel parameter controlling maximum number of semaphores in a
semaphore set.
SHMLBA - kernel parameter controlling alignment of shared memory
segments; all segments must be attached at multiples of this value.
Typically, non-tunable.


System errors
=============

ENOENT - No such file or directory, system error number 2
ENOMEM - Not enough core, system error number 12
EACCES - Permission denied, system error number 13
EINVAL - Invalid argument, system error number 22
EMFILE - Too many open files, system error number 24
ENOSPC - No space left on device, system error number 28
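
When an ORA error reports only the numeric system error, the symbolic name can be looked up. A minimal sketch using Python's standard errno module is shown here; the message text printed by os.strerror varies between platforms, so it may not match the wording in the list above.

```python
import errno
import os

NAMES = ("ENOENT", "ENOMEM", "EACCES", "EINVAL", "EMFILE", "ENOSPC")

# Map each symbolic name from the list above to its number on this system.
numbers = {name: getattr(errno, name) for name in NAMES}

for name, number in numbers.items():
    print(f"{name} = {number}: {os.strerror(number)}")
```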


Oracle parameters
=================

SS_SEG_MAX - Oracle parameter specified at compile time (therefore,
unmodifiable without an Oracle patch) which defines the maximum
number of segments the SGA can reside in. Normally set to 20.
SS_SEM_MAX - Oracle parameter specified at compile time (therefore,
unmodifiable without an Oracle patch) which defines the maximum
number of semaphore sets Oracle will allocate. Normally set to 10.
