A right to data is meaningless unless people know where to find it, so we support the creation of a central, user-friendly catalogue or inventory of all the information available. In our opinion, this catalogue should:
– Include data from more fragmented organisations, such as local authorities or the police, alongside central government datasets, so that comparisons can be made.
– Be easy to find through search engines.
– State how current the data is and how often it is updated (for example, whether it is a one-off release or published quarterly, with the date of the next publication).
– Link to related data sets, such as earlier releases in a historical series or other breakdowns by region or relevant agency.
– Offer a user-friendly format as well as the raw format, where possible.
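As an illustration only, a catalogue entry carrying the information listed above could be modelled along the following lines. This is a minimal sketch: the field names, the example dataset and the URLs are all hypothetical and do not describe any existing government catalogue.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CatalogueEntry:
    """One record in a hypothetical data catalogue (all field names are assumptions)."""
    title: str
    publisher: str                       # central department, local authority, police force...
    update_frequency: str                # e.g. "one-off" or "quarterly"
    last_published: date
    next_publication: Optional[date]     # None for a one-off release
    raw_url: str                         # the raw format
    friendly_url: Optional[str] = None   # a user-friendly format, where one exists
    related: List[str] = field(default_factory=list)  # links to related data sets

    def is_overdue(self, today: date) -> bool:
        """True if the announced next publication date has already passed."""
        return self.next_publication is not None and today > self.next_publication


# Illustrative entry; every value below is made up.
entry = CatalogueEntry(
    title="Recorded crime by region (illustrative)",
    publisher="Example Police Authority",
    update_frequency="quarterly",
    last_published=date(2011, 7, 1),
    next_publication=date(2011, 10, 1),
    raw_url="https://data.example.org/crime/2011-q2.csv",
    friendly_url="https://data.example.org/crime/2011-q2.html",
    related=["https://data.example.org/crime/2011-q1.csv"],
)
```

Recording the next publication date as structured data, rather than free text, would let the catalogue itself flag entries whose scheduled update has been missed.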
Users should be able to rank datasets by interest and value, and to post relevant applications. Strategies of this kind can reduce the time and resources needed to manage and maintain data inventories.
The government may also want to use visualisation tools to attract new users by drawing attention to important aspects of the system.
People, especially specialist developers, will comply with standards if the standards are reasonable, the cost of compliance is low, and compliance does not itself create further interoperability problems. Consistent schemas for specific data sets (such as bus timetables) should be encouraged, and they should interoperate with related data sets (for example, train timetables). The definition of each schema should be linked from the same website as the catalogue. Consistent master data across all relevant Government datasets (for example, the naming of hospitals or train stations) would further improve navigability, usability and interoperability.
Hosting key data sets on government servers, where they can easily be interrogated programmatically, would also improve usability. Users would not have to download an entire data set each time the information is updated, which makes it easier to build mobile phone applications. Issuing a private API key to each service user may be a cost-effective way to limit the number of queries an individual service can make, much as Google Maps does.
To achieve compliance and ensure usability and interoperability, the government should also adopt clear information governance measures and communicate them to its employees. Drawing on our extensive experience, we have developed information governance frameworks for public and private organisations that can be applied to US public services and the open data agenda. The framework addresses how to keep data private, confidential, secure, of high quality and intact. Two of these areas must be addressed to achieve usability and interoperability:
To enhance the quality of data, strict data hygiene standards should be implemented. Assuring data quality is a major challenge, especially in complex environments with multiple IT systems that do not all share common technical, data, communication, or terminology standards. Establishing standard interfaces and models for IT subsystems will help ensure data quality in these environments. Key components of an effective system architecture include:
– Automated and manual processes for detecting and correcting errors in data. Public service professionals should be encouraged and rewarded for recognising the consequences of poor data quality and for changing their behaviour to improve it over time.
– Data validation rules that ensure that data meets a set of standards for format, quality, integrity, accuracy, and structure.
– Open, rather than proprietary, standards for data recording and coding across multiple component systems.
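To make the idea of validation rules and automated error detection concrete, the sketch below checks a single bus timetable record and separates clean records from those needing manual correction. It is illustrative only: the field names (`stop_code`, `route`, `departure`), the stop code pattern and the time format are assumptions, not an agreed schema, and all values are assumed to be strings.

```python
import re
from datetime import datetime

# Hypothetical rule: stop codes are 3 to 12 upper-case letters or digits.
STOP_CODE = re.compile(r"^[A-Z0-9]{3,12}$")
REQUIRED_FIELDS = ("stop_code", "route", "departure")


def validate_record(record: dict) -> list:
    """Return a list of problems found; an empty list means the record passed."""
    problems = []
    # Structure: every required field must be present and non-empty.
    for name in REQUIRED_FIELDS:
        if not record.get(name):
            problems.append(f"missing field: {name}")
    # Format: stop codes must follow the single agreed pattern.
    stop = record.get("stop_code")
    if stop and not STOP_CODE.match(stop):
        problems.append(f"bad stop_code: {stop!r}")
    # Format: departure times must parse as 24-hour HH:MM.
    departure = record.get("departure")
    if departure:
        try:
            datetime.strptime(departure, "%H:%M")
        except ValueError:
            problems.append(f"bad departure time: {departure!r}")
    return problems


def split_valid(records):
    """Automated detection: separate clean records from those flagged for manual review."""
    clean, flagged = [], []
    for record in records:
        issues = validate_record(record)
        if issues:
            flagged.append((record, issues))
        else:
            clean.append(record)
    return clean, flagged
```

Returning the list of problems, rather than simply rejecting a record, gives the people responsible for the data the feedback they need to correct it at source.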