Configuration¶
pycsw’s runtime configuration is defined by default.cfg. pycsw ships with a sample configuration (default-sample.cfg). Copy the file to default.cfg and edit the following:
[server]
- home: the full filesystem path to pycsw
- url: the URL of the resulting service
- mimetype: the MIME type when returning HTTP responses
- language: the ISO 639-1 language and ISO 3166-1 alpha2 country code of the service (e.g. en-CA, fr-CA, en-US)
- encoding: the content type encoding (e.g. ISO-8859-1, see https://docs.python.org/2/library/codecs.html#standard-encodings). Default value is ‘UTF-8’
- maxrecords: the maximum number of records to return by default. This value is enforced if a CSW client’s maxRecords parameter is greater than server.maxrecords, to limit capacity. See MaxRecords Handling for more information
- loglevel: the logging level (see http://docs.python.org/library/logging.html#logging-levels)
- logfile: the full file path to the logfile
- ogc_schemas_base: base URL of OGC XML schemas tree file structure (default is http://schemas.opengis.net)
- federatedcatalogues: comma delimited list of CSW endpoints to be used for distributed searching, if requested by the client (see Distributed Searching)
- pretty_print: whether to pretty print the output (true or false). Default is false
- gzip_compresslevel: gzip compression level, lowest is 1, highest is 9. Default is off
- domainquerytype: for GetDomain operations, how to output domain values. Accepted values are list and range (min/max). Default is list
- domaincounts: for GetDomain operations, whether to provide frequency counts for values. Accepted values are true and false. Default is false
- profiles: comma delimited list of profiles to load at runtime (default is none). See Profile Plugins
- smtp_host: SMTP host for processing the csw:ResponseHandler parameter via outgoing email requests (default is localhost)
- spatial_ranking: parameter that enables (true or false) ranking of spatial query results as per K.J. Lanfear 2006 - A Spatial Overlay Ranking Method for a Geospatial Search of Text Objects.
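As an illustration, a [server] section built from the options above can be read with Python’s standard configparser module. This is only a sketch: every path, URL, and value below is a placeholder rather than a recommended setting, and the parsing shown is not necessarily how pycsw itself loads its configuration.

```python
import configparser

# Hypothetical [server] section; every value is a placeholder.
SAMPLE_CFG = """
[server]
home=/var/www/pycsw
url=http://localhost/pycsw/csw.py
mimetype=application/xml
encoding=UTF-8
language=en-US
maxrecords=10
pretty_print=true
domainquerytype=list
domaincounts=false
spatial_ranking=true
"""

# Interpolation is disabled so literal '%' or '$' characters are kept as-is.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE_CFG)
server = parser["server"]

# Typed accessors convert the raw strings stored in the file.
max_records = server.getint("maxrecords")
pretty_print = server.getboolean("pretty_print")
domain_counts = server.getboolean("domaincounts")

print(server["url"], max_records, pretty_print, domain_counts)
```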
[manager]
- transactions: whether to enable transactions (true or false). Default is false (see Transactions)
- allowed_ips: comma delimited list of IP addresses (e.g. 192.168.0.103), wildcards (e.g. 192.168.0.*) or CIDR notations (e.g. 192.168.100.0/24) allowed to perform transactions (see Transactions)
- csw_harvest_pagesize: when harvesting other CSW servers, the number of records per request to page by (default is 10)
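To make the three accepted allowed_ips forms concrete, the following sketch checks a client address against a comma delimited list of exact addresses, wildcards, and CIDR blocks. The function name ip_allowed is hypothetical and the logic is an assumption about how such matching could work, not pycsw’s actual implementation.

```python
import fnmatch
import ipaddress


def ip_allowed(client_ip, allowed_ips):
    """Return True if client_ip matches any entry of a comma delimited list.

    Illustrative helper only (hypothetical name). Entries may be exact
    addresses, wildcards such as 192.168.0.*, or CIDR blocks.
    """
    address = ipaddress.ip_address(client_ip)
    for entry in (item.strip() for item in allowed_ips.split(",")):
        if not entry:
            continue
        if "/" in entry:
            # CIDR notation, e.g. 192.168.100.0/24
            if address in ipaddress.ip_network(entry, strict=False):
                return True
        elif "*" in entry:
            # Wildcard, e.g. 192.168.0.*
            if fnmatch.fnmatchcase(client_ip, entry):
                return True
        elif entry == client_ip:
            # Exact address, e.g. 192.168.0.103
            return True
    return False


allowed = "192.168.0.103, 192.168.0.*, 192.168.100.0/24"
print(ip_allowed("192.168.100.42", allowed))
print(ip_allowed("10.0.0.1", allowed))
```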
[metadata:main]
- identification_title: the title of the service
- identification_abstract: some descriptive text about the service
- identification_keywords: comma delimited list of keywords about the service
- identification_keywords_type: keyword type as per the ISO 19115 MD_KeywordTypeCode codelist. Accepted values are discipline, temporal, place, theme, stratum
- identification_fees: fees associated with the service
- identification_accessconstraints: access constraints associated with the service
- provider_name: the name of the service provider
- provider_url: the URL of the service provider
- contact_name: the name of the provider contact
- contact_position: the position title of the provider contact
- contact_address: the address of the provider contact
- contact_city: the city of the provider contact
- contact_stateorprovince: the province or territory of the provider contact
- contact_postalcode: the postal code of the provider contact
- contact_country: the country of the provider contact
- contact_phone: the phone number of the provider contact
- contact_fax: the facsimile number of the provider contact
- contact_email: the email address of the provider contact
- contact_url: the URL to more information about the provider contact
- contact_hours: the hours of service to contact the provider
- contact_instructions: instructions on how to contact the provider contact
- contact_role: the role of the provider contact as per the ISO 19115 CI_RoleCode codelist. Accepted values are author, processor, publisher, custodian, pointOfContact, distributor, user, resourceProvider, originator, owner, principalInvestigator
[repository]
- database: the full file path to the metadata database, in database URL format (see http://docs.sqlalchemy.org/en/latest/core/engines.html#database-urls)
- table: the table name for metadata records (default is records). If you are using PostgreSQL with a DB schema other than public, qualify the table like myschema.table
- mappings: custom repository mappings (see Mapping to an Existing Repository)
- source: the source of this repository only if not local (e.g. GeoNode Configuration, Open Data Catalog Configuration). Supported values are geonode, odc
- filter: server side database filter to apply as mask to all CSW requests (see Repository Filters)
Note
See Administration for connecting your metadata repository and supported information models.
MaxRecords Handling¶
The following describes how maxRecords is handled by the configuration when processing GetRecords requests:
server.maxrecords | GetRecords.maxRecords | Result |
---|---|---|
none set | none passed | 10 (CSW default) |
20 | 14 | 20 |
20 | none passed | 20 |
none set | 100 | 100 |
20 | 200 | 20 |
Using environment variables in configuration files¶
pycsw configuration supports using system environment variables, which can be helpful, for example, when deploying into 12 factor environments.
Below is an example of how to integrate system environment variables in pycsw:
[repository]
database=${PYCSW_REPOSITORY_DATABASE_URI}
table=${MY_TABLE}
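The effect of the ${...} placeholders above can be approximated in Python with os.path.expandvars. This is a minimal sketch under stated assumptions: the database URL and table name are placeholder values set only for the demonstration, and pycsw’s own substitution logic may differ.

```python
import os

# Hypothetical values; in a real deployment these come from the environment.
os.environ["PYCSW_REPOSITORY_DATABASE_URI"] = "sqlite:////var/www/pycsw/records.db"
os.environ["MY_TABLE"] = "records"

# Raw option values as they would appear in the [repository] section.
raw = {
    "database": "${PYCSW_REPOSITORY_DATABASE_URI}",
    "table": "${MY_TABLE}",
}

# Replace each ${NAME} reference with the value of that environment variable.
expanded = {key: os.path.expandvars(value) for key, value in raw.items()}
print(expanded)
```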
Alternate Configurations¶
By default, pycsw loads default.cfg at runtime. To load an alternate configuration, modify csw.py to point to the desired configuration. Alternatively, pycsw supports explicitly specifying a configuration by appending config=/path/to/default.cfg to the base URL of the service (e.g. http://localhost/pycsw/csw.py?config=tests/suites/default/default.cfg&service=CSW&version=2.0.2&request=GetCapabilities). When the config parameter is passed by a CSW client, pycsw will override the default configuration location and subsequent settings with those of the specified configuration.
This also provides the functionality to deploy numerous CSW servers with a single pycsw installation.
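For clients that build requests programmatically, the config parameter is just another key in the query string. The sketch below assembles the example URL given above with Python’s standard library; the base URL and configuration path are the ones from that example and would differ in a real deployment.

```python
from urllib.parse import urlencode

base_url = "http://localhost/pycsw/csw.py"

# Query parameters, with config pointing at the alternate configuration.
params = {
    "config": "tests/suites/default/default.cfg",
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetCapabilities",
}

# safe="/" keeps the slashes in the configuration path readable.
request_url = base_url + "?" + urlencode(params, safe="/")
print(request_url)
```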
Hiding the Location¶
Some deployments with alternate configurations prefer not to advertise the base URL with the config= approach. In this case, there are several options for advertising the base URL.
Environment Variables¶
Configuration file location¶
One option is using Apache’s Alias and SetEnvIf directives. For example, given the base URL http://localhost/pycsw/csw.py?config=foo.cfg, set the following in Apache’s httpd.conf:
Alias /pycsw/csw-foo.py /var/www/pycsw/csw.py
SetEnvIf Request_URI "/pycsw/csw-foo.py" PYCSW_CONFIG=/var/www/pycsw/csw-foo.cfg
Note
Apache must be restarted after changes to httpd.conf.
pycsw will use the configuration as set in the PYCSW_CONFIG environment variable in the same manner as if it was specified in the base URL. Note that the server.url configuration value must match the Request_URI value so as to advertise correctly in pycsw’s Capabilities XML.
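The lookup described above can be sketched as a small function that chooses a configuration path from a CGI-style environment. The helper name resolve_config is hypothetical, and the precedence shown (the config= query parameter first, then PYCSW_CONFIG, then default.cfg) is an assumption made for illustration; the text here does not state which source wins when both are present.

```python
from urllib.parse import parse_qs


def resolve_config(environ, default="default.cfg"):
    """Pick a configuration path (hypothetical helper, not pycsw's API).

    Assumed order: config= query parameter, then PYCSW_CONFIG, then default.
    """
    query = parse_qs(environ.get("QUERY_STRING", ""))
    if "config" in query:
        return query["config"][0]
    return environ.get("PYCSW_CONFIG", default)


# Environment as Apache's SetEnvIf directive would provide it.
apache_env = {
    "QUERY_STRING": "service=CSW&request=GetCapabilities",
    "PYCSW_CONFIG": "/var/www/pycsw/csw-foo.cfg",
}
print(resolve_config(apache_env))
```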
Wrapper Script¶
Another option is to write a simple wrapper (e.g. csw-foo.sh), which provides the same functionality and can be deployed without restarting Apache:
#!/bin/sh
export PYCSW_CONFIG=/var/www/pycsw/csw-foo.cfg
/var/www/pycsw/csw.py