Repository Filters¶
pycsw has the ability to perform server side repository / database filters as a means to mask all CSW requests to query against a specific subset of the metadata repository, thus providing the ability to deploy multiple pycsw instances pointing to the same database in different ways via the repository.filter configuration option.
Repository filters are a convenient way to subset your repository at the server level without the hassle of creating proper database views. For large repositories, it may be better to subset at the database level for performance.
Scenario: One Database, Many Views¶
Imagine a sample database table of records (subset below for brevity):
identifier | parentidentifier | title | abstract |
---|---|---|---|
1 | 33 | foo1 | bar1 |
2 | 33 | foo2 | bar2 |
3 | 55 | foo3 | bar3 |
4 | 55 | foo1 | bar1 |
5 | 21 | foo5 | bar5 |
5 | 21 | foo6 | bar6 |
A default pycsw instance (with no repository.filters option) will always process CSW requests against the entire table. So a CSW GetRecords filter like:
<ogc:Filter>
<ogc:PropertyIsEqualTo>
<ogc:PropertyName>apiso:Title</ogc:PropertyName>
<ogc:Literal>foo1</ogc:Literal>
</ogc:PropertyIsEqualTo>
</ogc:Filter>
...will return:
identifier | parentidentifier | title | abstract |
---|---|---|---|
1 | 33 | foo1 | bar1 |
4 | 55 | foo1 | bar1 |
Suppose you wanted to deploy another pycsw instance which serves metadata from the same database, but only from a specific subset. Here we set the repository.filter option:
[repository]
database=sqlite:///records.db
filter=pycsw:ParentIdentifier = '33'
The same CSW GetRecords filter as per above then yields the following results:
identifier | parentidentifier | title | abstract |
---|---|---|---|
1 | 33 | foo1 | bar1 |
Another example:
[repository]
database=sqlite:///records.db
filter=pycsw:ParentIdentifier != '33'
The same CSW GetRecords filter as per above then yields the following results:
identifier | parentidentifier | title | abstract |
---|---|---|---|
4 | 55 | foo1 | bar1 |
The repository.filter option accepts all core queryables set in the pycsw core model (see pycsw.config.StaticContext.md_core_model for the complete list).