DCP-5
DCP-5: Valids and restrictions for query elements
Progress
- Submitted on: 24 Apr 2012
- Review period: 24 Apr 2012 to 24 May 2012
- Revision: TBD
- Vote: http://www.doodle.com/rw4kg7w6yxc5yfdf - ends 2012-06-01, around 12pm EDT.
- Final review: TBD
- Ratified / Rejected: TBD
- Facilitator: Ian Truslove (Truslove)
Description
Provide a mechanism to describe valid and invalid inputs for query elements in ESIP Discovery services.
Problem Addressed
OpenSearch does not have a mechanism for indicating valid options for search parameters. The OpenSearch Parameter extension (http://www.opensearch.org/Specifications/OpenSearch/Extensions/Parameter/1.0) has options for specifying the number of times a particular parameter may or should be included in the search request, but this does not meet our needs.
Proposed Solution
Add a validpatterns role to the OpenSearch <Query> element. In a <Query> element with that role, each attribute is named after a search parameter and its value is a regular expression describing the valid values for that parameter.
Example:
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <Url type="application/atom+xml"
       template="http://somedataprovider.com/?q={searchTerms}&amp;datum={datum?}&amp;gridSize={gridSize}&amp;format=atom"/>
  <Query role="example" searchTerms="map" datum="WGS84" gridSize="1km" />
  <Query role="http://esipfed.org/ns/discovery/1.1/#validpatterns"
         datum="WGS84|EGM96"
         gridSize="1m|10m|100m|500m|1km|10km" />
</OpenSearchDescription>
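A minimal sketch of how a client might consume such a description, assuming the datum pattern is carried in a datum attribute on the validpatterns <Query>. The function names are hypothetical, and Python's re module is used for illustration only; it is not the ECMAScript flavor, although these simple alternations behave the same in both.

```python
import re
import xml.etree.ElementTree as ET

OS_NS = "http://a9.com/-/spec/opensearch/1.1/"
VALIDPATTERNS_ROLE = "http://esipfed.org/ns/discovery/1.1/#validpatterns"

# Description document modelled on the example; the datum pattern is assumed
# to live in a "datum" attribute.
OSDD = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <Url type="application/atom+xml"
       template="http://somedataprovider.com/?q={searchTerms}&amp;datum={datum?}&amp;gridSize={gridSize}&amp;format=atom"/>
  <Query role="example" searchTerms="map" datum="WGS84" gridSize="1km"/>
  <Query role="http://esipfed.org/ns/discovery/1.1/#validpatterns"
         datum="WGS84|EGM96" gridSize="1m|10m|100m|500m|1km|10km"/>
</OpenSearchDescription>"""


def load_valid_patterns(osdd_xml):
    """Return {parameter name: compiled regex} from the validpatterns Query."""
    root = ET.fromstring(osdd_xml)
    patterns = {}
    for query in root.iter("{%s}Query" % OS_NS):
        if query.get("role") != VALIDPATTERNS_ROLE:
            continue
        for name, pattern in query.attrib.items():
            if name != "role":
                patterns[name] = re.compile(pattern)
    return patterns


def invalid_parameters(patterns, values):
    """Return the sorted names of supplied values that fail their pattern."""
    # fullmatch requires the whole value to match, which is how the HTML5
    # pattern attribute behaves.
    return sorted(
        name
        for name, value in values.items()
        if name in patterns and patterns[name].fullmatch(value) is None
    )


patterns = load_valid_patterns(OSDD)
print(invalid_parameters(patterns, {"searchTerms": "map", "datum": "WGS84", "gridSize": "1km"}))
print(invalid_parameters(patterns, {"datum": "NAD27", "gridSize": "2km"}))
```

Parameters without a pattern (searchTerms here) are treated as unconstrained, which is one possible reading of the proposal rather than something it states.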
Rationale for the Solution
- Terse - a single parameter's options fit into a single attribute value.
- Well-understood - this is the same mechanism HTML5 uses for input field validation (http://www.whatwg.org/specs/web-apps/current-work/multipage/common-input-element-attributes.html#the-pattern-attribute). Implementors are likely to be familiar with this web technology.
- Powerful - regular expressions can describe enumerations, numeric formats and free-form string constraints with a single notation.
Discussions
- There is a proliferation of regex "flavors" - see http://www.regular-expressions.info/refflavors.html. We should standardize on one of these flavors.
- The HTML5 specification uses the ECMAScript flavor of regex. This seems a reasonable, standardized flavor to adopt as part of DCP-5.
- Having now looked at the OS Parameters extension, I wonder if it makes sense to piggyback on that. The good thing there is that for each search parameter there is a <parameters:Parameter> tag that contains multiple attributes. One could add a 'pattern' attribute to contain the regular expression just as in the HTML <input> element. Or put the pattern attribute in a namespace if we can't get the author of the Parameters extension to agree.
- My (small) objection to the current proposal is that putting the actual search parameter names as many new attributes in a <Query> tag seems kind of sketchy (risky?). By having a separate tag for each one, there is always room for further extension using more attributes.
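A rough sketch of the Parameter-extension alternative raised above, with one element per search parameter and the regular expression in a namespaced pattern attribute. Everything specific here is an assumption made for illustration: the esip prefix and its namespace URI, the pattern attribute itself, and the Parameter extension namespace URI, which is written from memory of the draft and should be checked against it.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace of the OpenSearch Parameter extension draft.
PARAM_NS = "http://a9.com/-/spec/opensearch/extensions/parameters/1.0/"
# Hypothetical namespace for the pattern attribute; not defined anywhere.
ESIP_NS = "http://esipfed.org/ns/discovery/1.1/"

OSDD = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/"
    xmlns:parameters="http://a9.com/-/spec/opensearch/extensions/parameters/1.0/"
    xmlns:esip="http://esipfed.org/ns/discovery/1.1/">
  <Url type="application/atom+xml" template="http://somedataprovider.com/?format=atom">
    <parameters:Parameter name="q" value="{searchTerms}"/>
    <parameters:Parameter name="datum" value="{datum}" minimum="0"
                          esip:pattern="WGS84|EGM96"/>
    <parameters:Parameter name="gridSize" value="{gridSize}"
                          esip:pattern="1m|10m|100m|500m|1km|10km"/>
  </Url>
</OpenSearchDescription>"""


def parameter_patterns(osdd_xml):
    """Map each Parameter's name attribute to its compiled pattern, if it has one."""
    root = ET.fromstring(osdd_xml)
    found = {}
    for param in root.iter("{%s}Parameter" % PARAM_NS):
        pattern = param.get("{%s}pattern" % ESIP_NS)
        if pattern is not None:
            found[param.get("name")] = re.compile(pattern)
    return found


patterns = parameter_patterns(OSDD)
print(sorted(patterns))
print(patterns["gridSize"].fullmatch("500m") is not None)
```

Because each parameter has its own element, further per-parameter attributes could be added later without touching the <Query> element, which is the extensibility argument made in the discussion.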
Update 2012-07-17
tl;dr: I think using namespaced OpenSearch parameter names (http://www.opensearch.org/Specifications/OpenSearch/1.1#Parameter_names) and making sure the semantics are appropriately described already provides a codified, standardized solution to this problem.
It occurred to me that using regexes helps with imperatively describing what the valid values are, but does nothing to describe the semantics of the substitutions. Having regexes would allow service clients to perform client-side validation of template parameters, but that left me with the question "so what?"
Knowing the precise input format would let us make more customized UIs. Taking a concrete example, if the template parameter was ...?datum={datum} and the pattern was (NAD27|NAD83|WGS84) then our UI could replace a text input with a dropdown which only allows valid values. If the template parameter was ...?prominentColor={color}, and the pattern for "color" was a more typical regex like \#[0-9a-fA-F]{6}, the UI's options are limited, e.g. to just setting the pattern attribute on the input element.
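The dropdown-versus-text-input decision described above could be sketched as follows. This is an illustrative heuristic only: choose_widget and the returned dictionary shape are hypothetical, and the check recognises only patterns that are plain literal alternatives, optionally wrapped in one pair of parentheses.

```python
import re

# A "literal" here is a run of letters, digits, underscores, hyphens or spaces.
_LITERAL = r"[A-Za-z0-9_\- ]+"
# Literal alternatives such as NAD27|NAD83|WGS84, with optional outer parentheses.
_ENUMERATION = re.compile(r"\(?(%s(?:\|%s)*)\)?" % (_LITERAL, _LITERAL))


def choose_widget(pattern):
    """Pick a UI control for a parameter based on the shape of its pattern."""
    match = _ENUMERATION.fullmatch(pattern)
    # Reject patterns with only one of the two outer parentheses.
    balanced = pattern.startswith("(") == pattern.endswith(")")
    if match and balanced:
        return {"widget": "select", "options": match.group(1).split("|")}
    # Anything else falls back to a text input carrying the pattern.
    return {"widget": "text", "pattern": pattern}


print(choose_widget("(NAD27|NAD83|WGS84)"))
print(choose_widget(r"\#[0-9a-fA-F]{6}"))
```

The second call shows the limitation raised in the text: for a general regex the client can do little more than pass the pattern through to the input element.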
If we knew more about the semantics of the template substitutions, we could still build our clients to be rich, but we might know a little more about why. There's an OpenSearch mechanism to namespace the template parameters (http://www.opensearch.org/Specifications/OpenSearch/1.1#Parameter_names), and if I read the specs correctly it would allow one to do something like
<Url xmlns:geoDataApp="http://www.example.org/ns/apps/GeoPortal/1.0" template="http://www.example.org/services/geo_search?q={searchTerms}&amp;datum={geoDataApp:datum}" />
Then {geoDataApp:datum} has a semantic basis. The service provider could publish human-readable documentation at the namespace URI describing the meaning of the template fields, along with regex patterns that limit valid inputs. If client applications need to be super-responsive to the API, another service could be exposed to present valid enumerations or regexes - but how that's done remains within the purview of the service provider, and is not an ESIP or OpenSearch standard (though recommendations for how to document and expose such a service may well be useful).
A second example:
<Url xmlns:geoDataApp="http://www.example.org/ns/apps/GeoPortal/1.0" template="http://www.example.org/services/geo_search?q={searchTerms}&amp;backgroundColor={geoDataApp:color}" />
If the namespace's intent for geoDataApp:color is that it is a valid color, not only could the UI configure itself to limit valid inputs to the regex above, the UI could also present a color picker widget, an eyedropper tool, whatever - by giving the substitution variable a meaning and having an understanding about how those meanings are communicated.
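As a sketch of how a client might act on namespaced parameters, the code below resolves each template parameter to a (namespace URI, local name) pair and looks that pair up in a table of known semantics. The SEMANTICS table, the widget names and the function names are hypothetical; the rule that unqualified names belong to the OpenSearch 1.1 namespace comes from the OpenSearch specification. Prefix scoping is simplified to a single document-wide mapping.

```python
import io
import re
import xml.etree.ElementTree as ET

OS_NS = "http://a9.com/-/spec/opensearch/1.1/"
GEO_NS = "http://www.example.org/ns/apps/GeoPortal/1.0"

OSDD = """<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <Url xmlns:geoDataApp="http://www.example.org/ns/apps/GeoPortal/1.0"
       type="application/atom+xml"
       template="http://www.example.org/services/geo_search?q={searchTerms}&amp;datum={geoDataApp:datum}&amp;backgroundColor={geoDataApp:color?}"/>
</OpenSearchDescription>"""

# Hypothetical client-side knowledge about what each qualified name means.
SEMANTICS = {
    (OS_NS, "searchTerms"): "text",
    (GEO_NS, "datum"): "dropdown",
    (GEO_NS, "color"): "color-picker",
}

# {name}, {prefix:name}, each optionally followed by "?" before the brace.
TEMPLATE_PARAM = re.compile(r"\{(?:([A-Za-z_][\w.-]*):)?([A-Za-z_][\w.-]*)(\?)?\}")


def qualified_parameters(osdd_xml):
    """Return [(namespace URI, local name, optional)] for the first Url template."""
    prefixes = {}
    template = None
    source = io.BytesIO(osdd_xml.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri  # simplification: ignores prefix scoping
        elif template is None and payload.tag == "{%s}Url" % OS_NS:
            template = payload.get("template")
    result = []
    for prefix, local, optional in TEMPLATE_PARAM.findall(template or ""):
        # Unqualified names are implicitly in the OpenSearch 1.1 namespace.
        uri = prefixes[prefix] if prefix else OS_NS
        result.append((uri, local, optional == "?"))
    return result


def widget_for(uri, local):
    """Fall back to a plain text input for names the client knows nothing about."""
    return SEMANTICS.get((uri, local), "text")


for uri, local, optional in qualified_parameters(OSDD):
    print(local, widget_for(uri, local), "optional" if optional else "required")
```

How a client learns the contents of such a table (documentation at the namespace URI, a separate service, or hard-coded knowledge) is left open, as in the discussion above.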
I think this approach is codified in the OpenSearch spec given a couple of references: http://www.opensearch.org/Specifications/OpenSearch/1.1#Parameter_names indicates that "In the case of unqualified parameter names, the local parameter name is implicitly associated with the OpenSearch 1.1 namespace", and a small number of parameters such as searchTerms are given a semantic context in the spec (http://www.opensearch.org/Specifications/OpenSearch/1.1#The_.22searchTerms.22_parameter). So, if I'm right in thinking that we can use namespaces and the like to solve our problem of needing a way to validate inputs, I think DCP-5 is moot.
I'm really interested to hear feedback. I'm wary of paper-engineering the heck out of this instead of coming up with a pragmatic approach (and I think regex patterns, just as Brian suggested, are pragmatic), but at the same time I think that the problem has already been solved in the OpenSearch spec.
- Ian Truslove (Truslove) (talk) 17:14, 17 July 2012 (MDT)
Consensus
TBD
Voting results
TBD