
Solaris Service Management Facility - Service Developer Introduction
General Concepts
The smf (5) Service Model
The
service management facility defines a programming model for providing
persistently running applications called services. A service can
represent a number of software facilities, such as a set of running
processes, a set of system configuration parameters, or a synthetic set
of running services. A Solaris service is only started if it is marked
as enabled (by the administrator), and once all of its dependencies are
satisfied.
Taking the time to convert your existing services to smf(5)
lets them benefit from automated restart after hardware failure,
unexpected service failure, or administrative error. Participation in
the service management facility also brings enhanced visibility with
svcs(1) (as well as future-planned GUI tools) and ease of management
with svcadm(1M) and other Solaris management tools. Conversion usually
requires only the creation of a short XML file and a few simple
modifications to the service init script.
XML versus repository
A
service is usually defined by a service manifest, an XML file which
describes a service and any instances associated with that service. The
service manifest is pulled into the repository either at boot time, or
by using the svccfg import subcommand. The XML format of the service manifest is specified by the service description DTD, located at /usr/share/lib/xml/dtd/service_bundle.dtd.1.
The
repository is where the authoritative copy of service configuration
lives. This is where administrators may customize service settings once
they are installed on a system.
The rest of this document covers creating a service manifest, which is the delivery mechanism for all services.
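As a rough illustration of pulling a manifest into the repository by hand, the sketch below wraps svccfg import in a small shell function and runs svccfg validate first to catch XML errors. The function name import_manifest and the example path are hypothetical, chosen only for illustration. The sketch assumes svccfg(1M) is on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: validate a manifest, then import it into the repository.
# Assumes svccfg(1M) is on the PATH, as on a Solaris 10 system.
import_manifest() {
    manifest=$1
    if [ ! -r "$manifest" ]; then
        echo "cannot read manifest: $manifest" >&2
        return 1
    fi
    # Check the XML against the service bundle DTD before touching the repository.
    svccfg validate "$manifest" || return 1
    # Load the service and instance definitions into the repository.
    svccfg import "$manifest"
}

# Example (hypothetical path):
#   import_manifest /var/svc/manifest/site/mysvc.xml
```

Validating first means a malformed manifest is rejected before anything in the repository changes.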
Instance versus service split
A
service consists of the general service definition and one or more
instances which implement that service. An instance inherits its
properties from the service unless the instance specifically
overrides them.
As general guidance for the rest of this document, properties that
would be the same for every copy of your service running on the
system (if your service supports multiple copies) should be defined
at the service level.
If different instances may implement your service differently (e.g.
'smtp' may be implemented by sendmail, postfix, qmail, ...), define
the properties that are specific to one implementation on that
instance, not on the service.
All properties discussed below may be defined at the instance or service level.
Compatibility and caveats
smf(5) maintains compatibility for most applications started by init(1M) by placement in the /etc/rc?.d directories, and for applications delivered into inetd.conf.
Some init services, however, must be converted to smf(5) to preserve their boot-time ordering. An init
service must be converted if it affects other infrastructure services,
such as the early setup of devices, filesystems, or network configuration.
A service must also be converted if it requires input from the console
during the boot process. (Such services are strongly discouraged.)
Writing a Service Manifest
Name your service
We provide general service
categories for naming. These categories aren't used by the system, but
help the administrator in identifying the general use of the service.
These categories appear as directories under /var/svc/manifest, and include:
application -- higher level applications, such as apache
milestone -- collections of other services, such as name-services
platform -- platform-specific services, such as Dynamic Reconfiguration daemons
system -- Solaris system services, such as coreadm
device -- device-specific services
network -- network/internet services, such as protocols
site -- site specific descriptions
The service name describes what is being provided, and includes both any
category identifier and the actual service name, separated by '/'.
Service names should make it easy for the administrator to identify
the service being provided.
The instance name describes any specific
features about the instance. Most services deliver a 'default'
instance. Some (e.g. Oracle) may want to create instances based on
administrative configuration choices.
Services that are
shipped as part of a product or generally extend beyond a site-specific
definition should include either the stock symbol or Java-style
reversed domain prefix followed by a comma as part of the category or
service name for uniqueness.
As an example of the naming conventions above, the cron service specifies as its prelude:
<service
name='system/cron'
type='service'
version='1'>
Identify whether your service may have multiple instances
If multiple copies of your service running simultaneously on the system would cause an error, you must define it as a single_instance
service. This tag tells the restarter not to start multiple service
instances simultaneously, regardless of administrative configuration.
Most configuration and system services require single_instance
tags. Services such as web servers or databases, which could run
multiple configurations simultaneously (for example, using a different
database source or listening on a different port), should not be specified as single_instance.
Specify inside the service element, after its opening tag:
<single_instance />
Identify your service model
In order to provide restart capabilities for services with different run-time characteristics, smf (5) provides a variety of models for services. Currently, these models are provided by the svc.startd and inetd
restarters. Additional models may be provided in the future by either
these restarters or by additional restarters. While this document
describes the models for svc.startd (1M) and inetd (1M), please see the restarter documentation for more detail on the application model it provides.
If your service is started by inetd, see "Writing an inetd service manifest" below, as we have provided a tool to ease the transition.
svc.startd is a process-based restarter. It provides three distinct models for service processes:
Transient services are often configuration services, which require no
long-running processes to provide service. Common transient services
take care of boot-time cleanup or load configuration properties into
the kernel.
Transient services are also sometimes used to
overcome difficulties in conforming to the method requirements for
contract or wait services. This is not recommended and should be
considered a stopgap measure.
Wait services run for the lifetime of the child process, and are restarted when that process exits.
Contract services are the standard system daemons. They require processes which
run forever once started to provide service. Death of all processes in
a contract service is considered a service error, which will cause the
service to restart.
The default service model is contract, but may be modified by specifying the following in your service manifest for a transient service:
<property_group name='startd' type='framework'>
<propval name='duration' type='astring' value='transient' />
</property_group>
and the following for a wait service:
<property_group name='startd' type='framework'>
<propval name='duration' type='astring' value='child' />
</property_group>
Identify how your service is started/stopped.
smf(5) interacts with your service primarily through its methods. The stop and start methods must be provided for services managed by svc.startd, and each can either directly invoke a service binary or run a script which takes care of more complex setup. The refresh method is optional for svc.startd-managed services. Different restarters may require different methods.
Existing init
scripts can easily serve as the basis for service methods. We give the
following rules and guidance for the methods supported by svc.startd :
- all methods
Shell scripts should include /lib/svc/share/smf_include.sh to gain access to convenience functions and return value definitions.
Failures
must cause explicit error returns. All non-0 values are considered
errors. Additional information (for example, to avoid restart due to
configuration errors) may be provided to the restarter with the SMF_EXIT_* definitions.
Methods should emit log messages on failure. svc.startd writes them to the service log file, so the administrator can determine what went wrong.
The keywords :kill and :true are available for all method definitions.
:true simply returns success to the restarter.
:kill kills all processes started by your service's start method. The list of all processes is determined by the service's contract.
- start methods
A start method is required for all svc.startd -managed
services.
start methods are only run when the service is enabled and
dependencies are already met. Therefore, start methods should
exit with SMF_EXIT_ERR_CONFIG if the service cannot come
online due to any configuration error.
If your service is of type contract, the start method must
leave your daemon running if returning success, as exit of all processes
will cause the service to be restarted.
For contract and transient services, the start
method should not return success until service is being provided.
Note that this is true for daemons as well: a daemon shouldn't
fork() and then immediately exit() from its initial
process. Instead, the initial process should wait to return until
startup errors have been collected and can be reported. Many init scripts
used to start up the daemon and return immediately, counting on the
fact that the serial boot took 'a while' to start dependent services.
Now that dependent services are started precisely (often immediately)
after your service returns successfully from its start method,
imprecise semantics are not acceptable.
If code changes to the daemon/service can't be made, a positive test
for service is required before returning success. If no other
options are available, insert an appropriately long
sleep() before successful return.
- stop methods
A stop method is required for all svc.startd -managed
services.
Stop methods are run in a number of different scenarios, including
if a dependency has gone offline, if your service fails, and
if an administrator requests disable or restart.
Thus, stop methods should return success if the service is no
longer running after execution is complete, even if the service
wasn't running when the execution started. This is because
stop methods may be called in error scenarios.
- refresh methods
Refresh methods are optional for all
svc.startd -managed services.
Any defined refresh method has very precise semantics; it must
reload appropriate configuration parameters from the repository
or other configuration source without interrupting service. It
must not cause exit of the existing processes for contract or
wait services.
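The start and stop rules above can be sketched in shell roughly as follows. This is an illustration under stated assumptions, not a definitive method script: the helper names (wait_until, start_service, stop_service, is_gone), the pidfile convention, and the 30 and 10 second limits are all hypothetical. The fallback exit values mirror the SMF_EXIT_* definitions in smf_include.sh so the sketch can also be tried on a system without smf(5).

```shell
#!/bin/sh
# Sketch of start and stop logic for a hypothetical daemon.
# On Solaris the real definitions come from smf_include.sh; the fallback
# values below mirror that file so the sketch also runs elsewhere.
if [ -r /lib/svc/share/smf_include.sh ]; then
    . /lib/svc/share/smf_include.sh
else
    SMF_EXIT_OK=0
    SMF_EXIT_ERR_FATAL=95
    SMF_EXIT_ERR_CONFIG=96
fi

# wait_until TRIES CMD [ARGS...]: positive test for service. Runs CMD about
# once per second until it succeeds or TRIES attempts have been used.
wait_until() {
    tries=$1
    shift
    n=0
    while [ "$n" -lt "$tries" ]; do
        if "$@"; then
            return 0
        fi
        n=`expr "$n" + 1`
        if [ "$n" -lt "$tries" ]; then
            sleep 1
        fi
    done
    return 1
}

# start_service CONF DAEMON PROBE...: CONF, DAEMON and PROBE are placeholders.
# Success is returned only once PROBE confirms that service is being provided.
start_service() {
    conf=$1
    daemon=$2
    shift 2
    if [ ! -r "$conf" ]; then
        echo "configuration file $conf is missing" >&2
        return $SMF_EXIT_ERR_CONFIG
    fi
    "$daemon" "$conf" || return $SMF_EXIT_ERR_FATAL
    if wait_until 30 "$@"; then
        return $SMF_EXIT_OK
    fi
    echo "service did not come up in time" >&2
    return $SMF_EXIT_ERR_FATAL
}

# is_gone PID: succeeds when no process with that PID can be signalled.
is_gone() {
    if kill -0 "$1" 2>/dev/null; then
        return 1
    fi
    return 0
}

# stop_service PIDFILE: succeeds whenever the service is not running at the
# end, including the case where it was not running to begin with.
stop_service() {
    pidfile=$1
    if [ ! -r "$pidfile" ]; then
        return $SMF_EXIT_OK
    fi
    pid=`cat "$pidfile"`
    kill "$pid" 2>/dev/null
    if wait_until 10 is_gone "$pid"; then
        rm -f "$pidfile"
        return $SMF_EXIT_OK
    fi
    echo "process $pid did not exit" >&2
    return $SMF_EXIT_ERR_FATAL
}
```

A real method script would typically dispatch to these functions from a case statement on its first argument. For many contract services the :kill keyword can replace a hand-written stop function entirely.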
Timeouts must be provided for all methods. The timeout should be
defined to be the maximal amount of time in seconds that your method
might take to run on a slow system or under heavy load. A method which
exceeds its timeout will be killed. If the method could potentially
take an unbounded amount of time, such as a large filesystem fsck,
an infinite timeout may be specified as '0'.
We strongly discourage expecting user interaction (i.e. via console
input) as part of the service methods. Scripts which do so will
not work without modification, as the stdin/stdout/stderr
are not /dev/console for service methods.
We provide a set of method tokens available for use in method
specification for commonly used property values. A comprehensive
list is available in smf_method (5).
The default method environment is inherited from init (1M),
with the PATH set to /usr/sbin:/usr/bin .
Variables beginning with SMF_ are reserved for framework use.
The SMF_ variables defined in smf_method (5) are
provided to all methods; these include SMF_FMRI ,
SMF_METHOD , and SMF_RESTARTER .
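One possible use of these variables is to tag log messages with the instance and method that produced them. The helper below is a hypothetical sketch. The name log_msg and the fallback strings are illustrative; the fallbacks only matter when the script is run by hand outside smf(5).

```shell
#!/bin/sh
# Hypothetical logging helper for a method script. SMF_FMRI and SMF_METHOD
# are provided by the restarter; the defaults apply only when the script is
# run by hand outside smf(5).
log_msg() {
    echo "${SMF_FMRI:-unknown-fmri} [${SMF_METHOD:-manual}]: $*"
}

# Example:
#   log_msg "configuration loaded"
```

Output written by a method ends up in the service log file, so a prefix like this makes it easier to tell which method emitted a given line.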
Finally, each method may specify a method context, to define
system and security attributes used during method execution.
We recommend that long-running services be started with
reduced privileges and safe uids and gids, when possible.
An example of a start method specification is below.
<exec_method
type='method'
name='start'
exec='/lib/svc/method/svc-cron'
timeout_seconds='60'>
<method_context>
<method_credential user='root' group='root' />
</method_context>
</exec_method>
Determine faults to be ignored
If either your service is poorly behaved itself, or it might spawn
poorly behaved subprocesses, you will want to inform the restarter
that certain errors are expected and don't constitute service faults.
You may specify that coredumps from subprocesses shouldn't be considered
errors, or that external kill signals aren't errors. An example
of specifying that neither are errors is below.
<property_group name='startd' type='framework'>
<propval name='ignore_error' type='astring' value='core,signal' />
</property_group>
Identify dependencies
This is the most difficult part of service conversion, as most
dependencies are not explicitly stated. There are two types of
dependencies: file and service dependencies.
First, identify what other services are required for yours to be
started. For example, does your service require the network to be
plumbed, local devices to be configured, name services to be
available?
Once you've decided what your service is dependent on, you'll need
to specify the fault propagation model. For each dependency, decide
which events in the dependency should cause your service to restart:
none -- the dependency is required only for startup. No
fault or administrative action requires restart
error -- restart if the dependency has a fault (core
dump, system fault, etc.)
restart -- if the dependency is restarted, your service
should be restarted
refresh -- if the dependency is refreshed (its
configuration is changed), your service should be restarted
These values describe how your service handles a restart of the
specified dependency, and are set via the restart_on attribute.
Dependencies may be specified in groupings. The potential groupings
are:
require_all -- all in the group must be online or
degraded before your service is started
require_any -- any one of the services in the group
must be online or degraded before your service is started
optional_all -- if the services are enabled and able
to run (not in maintenance), they must be online or degraded before
your service is started
exclude_all -- if the service is enabled and online or
degraded, your service should not be started
If your service is dependent on a legacy script having run, we strongly
recommend you either convert or encourage your vendor to convert the
legacy script to an smf (5) service. Barring that, you can
specify a dependency on the milestone that script is part of. This
will never propagate errors from the legacy service, so only makes sense
as a restart_on=none dependency.
Finally, since you did the hard work to determine why a certain
dependency was required, write a comment to help future maintainers!
<!-- Must be able to resolve hostnames. -->
<dependency
name='nameservice'
type='service'
grouping='require_all'
restart_on='none'>
<service_fmri value='svc:/milestone/name-services' />
</dependency>
Identify dependents
If you wish to deliver a service which is a dependency of another
service that you don't deliver, you can specify this in your manifest
without modifying the manifest you don't own. That is, dependent
specifications are an easy way to have your service run before a
service delivered by Sun.
If not all of your dependent services have been converted, you'll
need to convert those too, as there is no way to specify a dependent
on a legacy script.
To avoid conflicts, we recommend prefacing your dependent name with
the name of your service.
For example, if you're delivering a service (mysvc ) that
must start before syslog, use the following:
<dependent
name='mysvc_syslog'
grouping='optional_all'
restart_on='none'>
<service_fmri value='svc:/system/system-log' />
</dependent>
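Once a manifest has been imported, the dependency and dependent relationships it declares can be checked from the shell. The sketch below assumes svcs(1) is on the PATH. The helper name show_deps is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list both directions of the dependency graph for one
# service. Assumes svcs(1) is on the PATH, as on a Solaris 10 system.
show_deps() {
    fmri=$1
    echo "Services that $fmri depends on:"
    svcs -d "$fmri" || return 1
    echo "Services that depend on $fmri:"
    svcs -D "$fmri"
}

# Example (mysvc is the placeholder service name used above):
#   show_deps svc:/site/mysvc:default
```

A dependent declared in your manifest should show up under svcs -D for your service, and as a dependency under svcs -d for the service named in the service_fmri element.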
Insert your service into a milestone
If your service was previously delivered into an rc?.d
directory and other services might depend on you, you should make
the milestone corresponding to your previous delivery location a
dependent.
For example, if your service was previously started at runlevel 2,
this clause will make sure that runlevel 2 is not considered complete
until your service has started.
<dependent
name='mysvc_multi-user'
grouping='require_all'
restart_on='none'>
<service_fmri value='svc:/milestone/multi-user' />
</dependent>
Create, if appropriate, a default instance
If your service doesn't require additional administrative intervention
for configuration before it starts the first time, you should configure
a default instance for your service.
If the instance has no configuration differences from the service,
this can easily be done with:
<create_default_instance enabled='false' />
Alternatively, you can explicitly define the instance.
<instance name='default' enabled='false'>
<!-- instance-specific properties, methods, etc. go here. -->
</instance>
We recommend that all instances be delivered as disabled unless
they are critical to system boot. Customization can then be done
by either the administrator or a profile (described elsewhere).
Create template information to describe your service
Document at least a common name in the C locale and a manpage
reference. The common name should
- be short (40 characters or less),
- avoid capital letters aside from trademarks like Solaris,
- avoid punctuation, and
- avoid the word service (but do distinguish between client and
server).
This information is presented by various forms of svcs (1)
to provide the administrator with concise detail about your service and
where to get more technical information. Common names may be localized.
<template>
<common_name>
<loctext xml:lang='C'>
Solaris fault manager
</loctext>
</common_name>
<documentation>
<manpage title='fmd' section='1M' manpath='/usr/share/man' />
</documentation>
</template>
Write/update an administrative command
If your service already has an administrative command which stops,
starts, or restarts your service, update it to use svcadm (1M),
or libscf calls. If an administrative command explicitly starts a daemon
outside of smf (5), the smf (5) framework
won't know there are other daemons running. Conflicts between daemons,
incorrect contracts, and lack of visibility using svcs (1)
are among the problems that will occur.
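A minimal sketch of such an update is shown below: a hypothetical control command, mysvcctl, that used to launch the daemon directly now hands each action to svcadm(1M). The command name and the FMRI are placeholders.

```shell
#!/bin/sh
# Hypothetical body for an existing "mysvcctl start|stop|restart" command.
# Instead of launching or killing the daemon itself, it asks smf(5) to act,
# so contracts and svcs(1) output stay accurate.
FMRI=svc:/site/mysvc:default    # placeholder FMRI

mysvcctl() {
    case "$1" in
    start)
        svcadm enable "$FMRI"
        ;;
    stop)
        svcadm disable "$FMRI"
        ;;
    restart)
        svcadm restart "$FMRI"
        ;;
    *)
        echo "usage: mysvcctl {start|stop|restart}" >&2
        return 2
        ;;
    esac
}
```

Note that svcadm enable and svcadm disable are persistent across reboots by default. Adding -t makes the change temporary, which may be closer to what an older start or stop command did.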
Remove your script from /etc/rc?.d locations and /etc/init.d
If you don't remove your init script, it will still be run in
legacy mode.
For more information
The DTD is self-documenting. Many questions can be resolved by
just reading /usr/share/lib/xml/dtd/service_bundle.dtd.1
on your Solaris 10 system.
Sun delivers many manifests in /var/svc/manifest. These may be
used as templates and examples. A few to start with:
The following manpages are a helpful start: smf (5),
smf_bootstrap (5), smf_method (5),
smf_restarter (5), smf_security (5),
svc.startd (1M), inetd (1M),
inetconv (1M).
Finally, the SMF System Administration Guide documentation and the
Predictive Self-Healing site on BigAdmin are also useful sources of
information.
Writing an inetd service manifest
Start with inetconv (1M), and include other modifications
such as adding templates and refining the name.
More to come on this and:
- packaging
- removing pre-converted inetd.conf services
Coming soon:
- Packaging
- Installing your service
- Logfiles
- Testing your service
- Upgrade considerations
- Using smf properties for service configuration.
- action_authorizations -- creating fine-grained administrative
roles for specific service administration.