Sun Microsystems - BigAdmin: Solaris Service Management Facility


Solaris Service Management Facility - Service Developer Introduction

General Concepts

The smf(5) Service Model

The service management facility defines a programming model for providing persistently running applications called services. A service can represent a number of software facilities, such as a set of running processes, a set of system configuration parameters, or a synthetic set of running services. A Solaris service is only started if it is marked as enabled (by the administrator), and once all of its dependencies are satisfied.

Converting your existing services to smf(5) lets them take advantage of automated restart after hardware failure, unexpected service failure, or administrative error. Participation in the service management facility also brings enhanced visibility with svcs(1) (as well as planned GUI tools) and ease of management with svcadm(1M) and other Solaris management tools. Conversion usually requires only creating a short XML file and making a few simple modifications to the service init script.

XML versus repository

A service is usually defined by a service manifest, an XML file which describes a service and any instances associated with that service. The service manifest is pulled into the repository either at boot time, or by using the svccfg import subcommand. The XML format of the service manifest is specified by the Service description DTD, located at /usr/share/lib/xml/dtd/service_bundle.dtd.1.
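
As a rough sketch of what such a manifest file looks like as a whole, the fragments shown in the rest of this document all sit inside a service element, which in turn is wrapped in a service_bundle element that references the DTD above. The bundle name 'mysvc' and the service name 'site/mysvc' are placeholders chosen for illustration, not names delivered by Solaris.

```xml
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>

<!-- 'mysvc' and 'site/mysvc' are hypothetical names used for illustration. -->
<service_bundle type='manifest' name='mysvc'>
   <service
      name='site/mysvc'
      type='service'
      version='1'>

      <!-- Dependencies, methods, property groups, and instances
           described in the following sections go here. -->

   </service>
</service_bundle>
```

A file like this is what the svccfg import subcommand reads into the repository.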

The repository is where the authoritative copy of service configuration lives. This is where administrators may customize service settings once they are installed on a system.

The rest of this document covers creating a service manifest, which is the delivery mechanism for all services.

Instance versus service split

A service consists of the general service definition and one or more instances which implement that service. An instance inherits its properties from the service unless the instance specifically overrides them.

Thus, as general guidance for the rest of this document, any property that would stay the same for every copy of your service (if your service supports running multiple copies) should be defined at the service level.

If your service may be implemented differently by different instances (e.g. 'smtp' may be implemented by sendmail, postfix, qmail, ...), define the properties that are specific to one implementation at the instance level, not the service level.
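
The split might look roughly like the sketch below, which uses the 'smtp' case from the paragraph above. The dependency applies to any SMTP implementation, so it sits at the service level; the methods belong to one implementation, so they sit in the instance. The method paths and timeout values are assumptions made for illustration and should not be read as the contents of any delivered manifest.

```xml
<service
   name='network/smtp'
   type='service'
   version='1'>

   <!-- Service level: holds for any SMTP implementation. -->
   <dependency
      name='fs-local'
      type='service'
      grouping='require_all'
      restart_on='none'>
         <service_fmri value='svc:/system/filesystem/local' />
   </dependency>

   <!-- Instance level: specific to the sendmail implementation.
        Paths and timeouts here are illustrative assumptions. -->
   <instance name='sendmail' enabled='false'>
      <exec_method
         type='method'
         name='start'
         exec='/lib/svc/method/smtp-sendmail start'
         timeout_seconds='120' />
      <exec_method
         type='method'
         name='stop'
         exec='/lib/svc/method/smtp-sendmail stop'
         timeout_seconds='60' />
   </instance>
</service>
```

A postfix or qmail instance could then be added alongside without touching the service-level definitions.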

All properties discussed below may be defined at the instance or service level.

Compatibility and caveats

smf(5) maintains compatibility for most applications started by init(1M) by placement in the /etc/rc?.d directories, and for applications delivered into inetd.conf.

Some init services, however, must be converted to smf(5) to preserve their boot-time ordering. An init service needs to be converted if it affects other infrastructure services, such as the early setup of devices, filesystems, or network configuration. A service also needs to be converted if it requires input from the console during the boot process. (Such services are strongly discouraged.)

Writing a Service Manifest

Name your service

We provide general service categories for naming. These categories aren't used by the system, but help the administrator in identifying the general use of the service.

These categories are shown in /var/svc/manifest, and include:
- application -- higher level applications, such as apache
- milestone -- collections of other services, such as name-services
- platform -- platform-specific services, such as Dynamic Reconfiguration daemons
- system -- Solaris system services, such as coreadm
- device -- device-specific services
- network -- network/internet services, such as protocols
- site -- site-specific descriptions

The service name describes what is being provided, and includes both any category identifier and the actual service name, separated by '/'. A service name should make it easy for the administrator to identify the service being provided.

The instance name describes any specific features about the instance. Most services deliver a 'default' instance. Some (e.g. Oracle) may want to create instances based on administrative configuration choices.

Services that are shipped as part of a product or generally extend beyond a site-specific definition should include either the stock symbol or Java-style reversed domain prefix followed by a comma as part of the category or service name for uniqueness.

As an example of the naming conventions above, the cron service specifies as its prelude:
```
      <service
         name='system/cron'
         type='service'
         version='1'>
		
```
Identify whether your service may have multiple instances

If running multiple copies of your service simultaneously on the system would cause an error, you must define it as a single_instance service. This tag tells the restarter not to start multiple instances of the service simultaneously, regardless of administrative configuration.

Most configuration and system services require single_instance tags. Services such as web servers or databases, which could run multiple configurations simultaneously (for example, using a different database source or running on a different port), should not be specified as single_instance.

Specify inside the service block, immediately after the opening service tag:
```
      <single_instance />
   		
```
Identify your service model

In order to provide restart capabilities for services with different run-time characteristics, smf(5) provides a variety of models for services. Currently, these models are provided by the svc.startd and inetd restarters. Additional models may be provided in the future by either these restarters or by additional restarters. While this document describes the models for svc.startd(1M) and inetd(1M), please see the restarter documentation for more detail on the application model it provides.

If your service is started by inetd, see "Writing an inetd service manifest" below, as we have provided a tool to ease the transition.

svc.startd is a process-based restarter. It provides three distinct models for service processes:
- Transient services are often configuration services, which require no long-running processes to provide service. Common transient services take care of boot-time cleanup or load of configuration properties into the kernel.
  
  Transient services are also sometimes used to overcome difficulties in conforming to the method requirements for contract or wait services. This is not recommended and should be considered a stopgap measure.
- Wait services run for the lifetime of the child process, and are restarted when that process exits.
- Contract services are the standard system daemons. They require processes which run forever once started to provide service. Death of all processes in a contract service is considered a service error, which will cause the service to restart.

The default service model is contract, but may be modified by specifying the following in your service manifest for a transient service:
```
      <property_group name='startd' type='framework'>
           <propval name='duration' type='astring' value='transient' />
      </property_group>
		
```
and the following for a wait service:
```
      <property_group name='startd' type='framework'>
           <propval name='duration' type='astring' value='child' />
      </property_group>
		
```
Identify how your service is started/stopped.

smf(5) interacts with your service primarily through its methods. The stop and start methods must be provided for services managed by svc.startd, and can either directly invoke a service binary or invoke a script which takes care of more complex setup. The refresh method is optional for svc.startd-managed services. Different restarters may require different methods.

Existing init scripts can easily serve as the basis for service methods. We give the following rules and guidance for the methods supported by svc.startd:
- all methods
  - Shell scripts should include /lib/svc/share/smf_include.sh to gain access to convenience functions and return value definitions.
  - Failures must cause explicit error returns. All non-0 values are considered errors. Additional information (for example, to avoid restart due to configuration errors) may be provided to the restarter with the SMF_EXIT_* definitions.
  - Methods should emit log messages on failure. These are logged by svc.startd to the service log file, so the administrator can determine what went wrong.
  - The keywords :kill and :true are available for all method definitions.
    
    :true simply returns success to the restarter.
    
    :kill kills all processes started by your service's start method. The list of all processes is determined by the service's contract.
- start methods
  - A start method is required for all svc.startd-managed services.
  - start methods are only run when the service is enabled and dependencies are already met. Therefore, start methods should exit with SMF_EXIT_ERR_CONFIG if the service cannot come online due to any configuration error.
  - If your service is of type contract, the start method must leave your daemon running if returning success, as exit of all processes will cause the service to be restarted.
  - For contract and transient services, the start method should not return success until service is being provided. Note that this is true for daemons as well; daemons shouldn't fork() then exit() from their initial process, they should wait to return until startup errors have been accumulated and can be reported. Many init scripts used to start up the daemon and return immediately, counting on the fact that the serial boot took 'a while' to start dependent services. Now that dependent services are started precisely (often immediately) after your service returns successfully from its start method, imprecise semantics are not acceptable.
    
    If code changes to the daemon/service can't be made, a positive test for service is required before returning success. If no other options are available, insert an appropriately long sleep() before successful return.
- stop methods
  - A stop method is required for all svc.startd-managed services.
  - Stop methods are run in a number of different scenarios, including if a dependency has gone offline, if your service fails, and if an administrator requests disable or restart.
  - Thus, stop methods should return success if the service is no longer running after execution is complete, even if the service wasn't running when the execution started. This is because stop methods may be called in error scenarios.
- refresh methods
  - Refresh methods are optional for all svc.startd-managed services.
  - Any defined refresh method has very precise semantics; it must reload appropriate configuration parameters from the repository or other configuration source without interrupting service. It must not cause exit of the existing processes for contract or wait services.

Timeouts must be provided for all methods. The timeout should be defined to be the maximal amount of time in seconds that your method might take to run on a slow system or under heavy load. A method which exceeds its timeout will be killed. If the method could potentially take an unbounded amount of time, such as a large filesystem fsck, an infinite timeout may be specified as '0'.
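
As a sketch of how the :kill keyword and the timeout rules fit together, the fragment below declares a stop method that kills every process in the service's contract, and a start method that is allowed to run without a time limit. The script path /lib/svc/method/mysvc-check is a hypothetical name used only for illustration.

```xml
      <!-- Stop by killing all processes in the service's contract. -->
      <exec_method
         type='method'
         name='stop'
         exec=':kill'
         timeout_seconds='60' />

      <!-- A start method that may legitimately take an unbounded
           amount of time; '0' disables the timeout.
           The script path is a hypothetical example. -->
      <exec_method
         type='method'
         name='start'
         exec='/lib/svc/method/mysvc-check'
         timeout_seconds='0' />
```

An infinite timeout removes the restarter's ability to recover from a hung method, so it is probably best kept for cases like the fsck example above.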

We strongly discourage expecting user interaction (i.e. via console input) as part of the service methods. Scripts which do so will not work without modification, as the stdin/stdout/stderr are not /dev/console for service methods.

We provide a set of method tokens available for use in method specification for commonly used property values. A comprehensive list is available in smf_method(5).
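
For instance, a single script can serve several methods if the method name is passed to it with the %m token, which smf_method(5) describes as expanding to the name of the method being run. The script path below is a hypothetical name; check smf_method(5) on your system for the full token list before relying on any token.

```xml
      <!-- %m expands to the method name, so the (hypothetical) script
           is invoked as 'svc-mysvc refresh'. -->
      <exec_method
         type='method'
         name='refresh'
         exec='/lib/svc/method/svc-mysvc %m'
         timeout_seconds='60' />
```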

The default method environment is inherited from init(1M), with the PATH set to /usr/sbin:/usr/bin. Variables beginning with SMF_ are reserved for framework use. The SMF_ variables defined in smf_method(5) are provided to all methods; these include SMF_FMRI, SMF_METHOD, and SMF_RESTARTER.

Finally, each method may specify a method context, to define system and security attributes used during method execution. We recommend that long-running services be started with reduced privileges and safe uids and gids, when possible.

An example of a start method specification is below.
```
      <exec_method
         type='method'
         name='start'
         exec='/lib/svc/method/svc-cron'
         timeout_seconds='60'>
         <method_context>
            <method_credential user='root' group='root' />
         </method_context>
      </exec_method>
   
```
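The example above runs as root. Following the recommendation about reduced privileges, a daemon that only needs to bind to a reserved port might be started along the lines of the sketch below. The daemon path, the choice of the daemon user and group, and the privilege set are all assumptions made for illustration; the right values depend on what your service actually does.

```xml
      <!-- Hypothetical daemon started as a non-root user with the basic
           privilege set plus the right to bind to privileged ports. -->
      <exec_method
         type='method'
         name='start'
         exec='/usr/lib/mysvcd'
         timeout_seconds='60'>
         <method_context>
            <method_credential
               user='daemon'
               group='daemon'
               privileges='basic,net_privaddr' />
         </method_context>
      </exec_method>
```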
Determine faults to be ignored

If either your service is poorly behaved itself, or it might spawn poorly behaved subprocesses, you will want to inform the restarter that certain errors are expected and don't constitute service faults.

You may specify that core dumps from subprocesses shouldn't be considered errors, or that external kill signals aren't errors. An example specifying that neither is an error is below.
```
      <property_group name='startd' type='framework'>
         <propval name='ignore_error' type='astring' value='core,signal' />
      </property_group>
   
```
Identify dependencies

This is the most difficult part of service conversion, as most dependencies are not explicitly stated. There are two different types of dependencies: file dependencies and service dependencies.

First, identify what other services are required for yours to be started. For example, does your service require the network to be plumbed, local devices to be configured, name services to be available?

Once you've decided what your service depends on, you'll need to specify the fault propagation model. For each dependency, choose when your service should be restarted:
1. none -- the dependency is required only for startup. No fault or administrative action on the dependency requires a restart of your service
2. error -- restart your service if the dependency has a fault (core dump, system fault, etc.)
3. restart -- restart your service if the dependency is restarted
4. refresh -- restart your service if the dependency is refreshed (its configuration is changed)

These values are set with restart_on, and reflect how well your service can handle a restart of the specified dependency.

Dependencies may be specified in groupings. The potential groupings are:
1. require_all -- all services in the group must be online or degraded before your service is started
2. require_any -- any one of the services in the group must be online or degraded before your service is started
3. optional_all -- if the services in the group are enabled and able to run (not in maintenance), they must be online or degraded before your service is started
4. exclude_all -- if any service in the group is enabled and online or degraded, your service should not be started

If your service depends on a legacy script having run, we strongly recommend you either convert the legacy script to an smf(5) service or encourage your vendor to do so. Barring that, you can specify a dependency on the milestone that script is part of. This will never propagate errors from the legacy service, so it only makes sense as a restart_on=none dependency.

Finally, since you did the hard work to determine why a certain dependency was required, write a comment to help future maintainers!
```
      
      <!-- Name services must be available before this service starts. -->
      <dependency
         name='nameservice'
         type='service'
         grouping='require_all'
         restart_on='none'>
            <service_fmri value='svc:/milestone/name-services' />
      </dependency>
   
```
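The example above is a service dependency. A file dependency, the other type mentioned at the start of this section, can be sketched as follows: the dependency type is 'path' and the FMRI is a file:// URL. The configuration file /etc/mysvc.conf and the dependency name are hypothetical.

```xml
      <!-- Do not start until the (hypothetical) configuration file exists. -->
      <dependency
         name='mysvc-config-file'
         type='path'
         grouping='require_all'
         restart_on='none'>
            <service_fmri value='file://localhost/etc/mysvc.conf' />
      </dependency>
```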
Identify dependents

If you wish to deliver a service which is a dependency of another service that you don't deliver, you can specify this in your manifest without modifying the manifest you don't own. That is, dependent specifications are an easy way to have your service run before a service delivered by Sun.

If not all of your dependent services have been converted, you'll need to convert those too, as there is no way to specify a dependent on a legacy script.

To avoid conflicts, we recommend prefacing your dependent name with the name of your service.

For example, if you're delivering a service (mysvc) that must start before syslog, use the following:
```
      <dependent
         name='mysvc_syslog'
         grouping='optional_all'
         restart_on='none'>
            <service_fmri value='svc:/system/system-log' />
      </dependent>
   
```
Insert your service into a milestone

If your service was previously delivered into an rc?.d directory and other services might depend on it, you should make the milestone corresponding to your previous delivery location a dependent.

For example, if your service was previously started at runlevel 2, this clause will make sure that runlevel 2 is not considered complete until your service has started.
```
   <dependent
      name='mysvc_multi-user'
      grouping='require_all'
      restart_on='none'>
      <service_fmri value='svc:/milestone/multi-user' />
   </dependent>
   
```
Create, if appropriate, a default instance

If your service doesn't require additional administrative intervention for configuration before it starts the first time, you should configure a default instance for your service.

If the instance has no configuration differences from the service, this can easily be done with:
```
<create_default_instance enabled='false' />
```
Alternatively, you can explicitly define the instance.
```
      <instance name='default' enabled='false'>
         
      </instance>
   
```
We recommend that all instances be delivered as disabled unless they are critical to system boot. Customization can then be done by either the administrator or a profile (described elsewhere).

Create template information to describe your service

Document at least a common name in the C locale and a manpage reference. The common name should
- be short (40 characters or less),
- avoid capital letters aside from trademarks like Solaris,
- avoid punctuation, and
- avoid the word service (but do distinguish between client and server).
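
A template block following these guidelines might look like the sketch below, written for the cron example used earlier. The common name text and the manpage attributes are illustrative values, not a copy of the delivered cron manifest.

```xml
      <template>
         <common_name>
            <!-- Short, lower case, no punctuation, no word 'service'. -->
            <loctext xml:lang='C'>clock daemon</loctext>
         </common_name>
         <documentation>
            <manpage title='cron' section='1M' manpath='/usr/man' />
         </documentation>
      </template>
```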

Write/update an administrative command

If your service already has an administrative command which stops, starts, or restarts your service, update it to use svcadm(1M), or libscf calls. If an administrative command explicitly starts a daemon outside of smf(5), the smf(5) framework won't know there are other daemons running. Conflicts between daemons, incorrect contracts, and lack of visibility using svcs(1) are among the problems that will occur.

Remove your script from /etc/rc?.d locations and /etc/init.d

If you don't remove your init script, it will still be run in legacy mode.

For more information

The DTD is self-documenting. Many questions can be resolved by just reading /usr/share/lib/xml/dtd/service_bundle.dtd.1 on your Solaris 10 system.

Sun delivers many manifests in /var/svc/manifest. These may be used as templates and examples. A few to start with:

system/utmp is a simple standalone daemon,
system/coreadm is a simple transient service, and
network/telnet is an inetd(1M)-based daemon.

The following manpages are a helpful start: smf(5), smf_bootstrap(5), smf_method(5), smf_restarter(5), smf_security(5), svc.startd(1M), inetd(1M), inetconv(1M).

Finally, the SMF System Administration Guide documentation and the Predictive Self-Healing site on BigAdmin are also useful sources of information.

Writing an inetd service manifest

Start with inetconv(1M), and include other modifications such as adding templates and refining the name.

More to come on this and:

packaging
removing pre-converted inetd.conf services

Coming soon:

Packaging
Installing your service
Logfiles
Testing your service
Upgrade considerations
Using smf properties for service configuration.
action_authorizations -- creating fine-grained administrative roles for specific service administration.