Data, especially real-world medical data, is highly sensitive. What would happen if all the GPS tracking data of a research project became public, along with identifiable information? Can data that has been polluted by a hacker still be published? And finally, can data from different sources easily be merged or at least compared? Answering these questions takes a multi-pronged approach. One part of the answer is to set up a dedicated user management service and to delegate authorization to the data handling services.
User management in modern IT systems
User management is an essential part of modern, complex IT systems. Simply put, user management is an authentication feature that enables the administrators of a website, a server, or an application to identify and control the state of users logged into a network. This includes the ability to provide new users with different levels of access, control user login counts and login times, or filter users that are currently logged into the network.
Cookie-based and token-based authentication
The two main approaches to user authentication are traditional cookie-based authentication and the more modern token-based authentication. The traditional approach relies on session IDs that are stored on the server. This means that developers have to implement session storage specific to the server (extra development effort) and that servers storing session IDs may become overloaded and slow when many sessions run in parallel.
The token-based approach, on the other hand, relies on tokens instead of session IDs. Since all the information relevant for authentication is stored in the token itself, the server does not need to store it for the duration of the session. This also simplifies user management in complex systems with multiple servers and applications.
A good analogy to illustrate these two authentication approaches is the registration and access of participants at a conference or convention. You register at the entrance at the beginning of the first day and validate your identity with your government-issued ID (creating a new user), after which you get a participant badge (authentication completed) and can enter the conference floor. When you return to the conference the next day (a new session) or try to attend a parallel session (an authorisation request), your badge, and not your actual ID, is checked.
Translated to cookie-based authentication, each security guard would need a list of all attendees and their registrations, and would have to check that the badge corresponds to one of the registrations every time a participant enters not just the conference floor, but any new area or session.
In the context of tokens, you are identified and authorised by every gatekeeper based on the presence of the right badge around your neck, as the gatekeeper trusts that your badge is a proof of your access to the conference floor.
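To make the difference concrete, here is a minimal sketch of both approaches using only the Python standard library. It is an illustration under stated assumptions, not a production design: the function names, the `SESSIONS` table and the demo secret are all hypothetical, and the "token" is deliberately simplified to a username plus a signature.

```python
import hashlib
import hmac
import secrets

# Session style: the server keeps a table of every active session,
# like the guard's list of all registered attendees.
SESSIONS = {}  # hypothetical in-memory store: session_id -> username


def session_login(username: str) -> str:
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = username
    return session_id


def session_check(session_id: str):
    # Every check needs a lookup in the server-side table.
    return SESSIONS.get(session_id)


# Token style: the server only keeps a signing secret, like a guard who
# trusts any badge that carries the organiser's stamp.
SECRET = b"demo-secret-not-for-production"  # assumption: demo value only


def token_login(username: str) -> str:
    signature = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}.{signature}"


def token_check(token: str):
    username, _, signature = token.rpartition(".")
    expected = hmac.new(SECRET, username.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison; no per-user state is consulted.
    if hmac.compare_digest(signature.encode(), expected.encode()):
        return username
    return None
```

Note that `token_check` never touches a session table, which is why any server that knows how to verify the signature can validate the token on its own.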
Identity provider, JSON Web Token and OAuth 2.0
Token-based authentication and user management commonly use JSON Web Tokens, or JWT. In computer systems, an access token is an object that contains the security credentials for a login session and identifies the user and user-related information, e.g., the user’s privileges.
JWT is a JSON-based open standard for creating access tokens, which defines a way to securely transmit information between parties. Applications and servers can verify and trust this information because the token is digitally signed.
JWTs are granted to the user by an identity provider, a system that has the rights to create users and grant them a certain level of access. For example, when patients participate in a clinical trial, the clinical trial manager will register every patient and their doctors in a system coupled with the clinical trial. The manager will register patients as users of the system and give them access at the level of submitting some of their data (e.g., entering answers to questionnaires). The doctors will be able to add extra patient information (e.g., blood test results) for their own patients, as well as download data of all patients participating in the study. In this example, the system containing patients’ and doctors’ information and access levels plays the role of the identity provider.
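As an illustration of what "signed" means here, the following sketch builds and verifies a JWT with the HS256 algorithm (HMAC with SHA-256) using only the Python standard library. The helper names `sign_jwt` and `verify_jwt` are hypothetical, and the shared secret is an assumption made to keep the example self-contained. A real deployment would normally use a maintained JWT library, and often an asymmetric algorithm so that services only need the identity provider's public key.

```python
import base64
import hashlib
import hmac
import json
import time


def b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 without '=' padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    # Restore the padding that b64url() stripped.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(signature)


def verify_jwt(token: str, secret: bytes, now=None) -> dict:
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(b64url_decode(signature_b64), expected):
        raise ValueError("bad signature")
    if json.loads(b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(b64url_decode(payload_b64))
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

Because the signature covers both the header and the payload, changing a single claim invalidates the token, which is what lets a service trust the content without calling back to the issuer.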
Different levels of access, or levels of authority, are assigned through claims: statements made by the users, such as “I am user X”, “I have access to server Y”, or “I can use service Z from server Y”. The validity of a claim is checked once by the identity provider, and once it is verified, the claim can be used with any of the services (servers and apps, or clients) from the list of services trusted by the identity provider. The level of trust can be determined in the description of the client and does not have to be automatic. In scenarios where extra security and validation are required, it may be specified that an actual person needs to manually log in and say “I trust this client” before the identity provider verifies and trusts this client.
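A service that receives a verified token can then decide what the user may do by looking at the claims alone. The sketch below assumes the token has already been verified and that it carries an `aud` (audience) claim listing the services it is meant for and a space-separated `scope` claim. The service names and scope strings are hypothetical and only mirror the clinical trial example.

```python
def is_allowed(claims: dict, service: str, required_scope: str) -> bool:
    """Return True if verified claims grant `required_scope` on `service`."""
    audiences = claims.get("aud", [])
    if isinstance(audiences, str):
        # A single audience may be given as a plain string.
        audiences = [audiences]
    scopes = claims.get("scope", "").split()
    return service in audiences and required_scope in scopes


# Hypothetical claims for a patient and a doctor in the same study.
patient = {"sub": "p-001", "aud": "collection-api", "scope": "questionnaire:write"}
doctor = {
    "sub": "d-042",
    "aud": ["collection-api", "export-api"],
    "scope": "bloodtest:write data:read",
}
```

With these claims, the patient can submit questionnaires to the collection service but cannot download study data, while the doctor can do both of the things granted by their scopes.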
The identity provider provides a way to decouple identifiable information from a user. It can simply hand out a single anonymous user ID per user that all data handling services can use to annotate the data. This way, there is an easy way to merge data from multiple sources but from a single user at a later stage. Conversely, if the privacy aspect is deemed more important, an identity provider can hand out a unique anonymous ID for a user per client, so that clients cannot cross-reference data collected by other means.
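One way to produce such anonymous IDs is to derive them from the internal user ID with a keyed hash that only the identity provider can compute. This is a sketch under assumptions, not a description of any particular identity provider: the function names, the key and the 32-character truncation are all illustrative choices.

```python
import hashlib
import hmac


def global_subject(user_id: str, key: bytes) -> str:
    # One stable pseudonym per user: data from all clients can be merged later.
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()[:32]


def pairwise_subject(user_id: str, client_id: str, key: bytes) -> str:
    # A different pseudonym per (client, user) pair: clients cannot
    # cross-reference their data without the identity provider's key.
    message = f"{client_id}\x00{user_id}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()[:32]
```

The trade-off from the paragraph above shows up directly: `global_subject` returns the same value regardless of the client, whereas `pairwise_subject` returns unrelated values for the same user in two different clients.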
OAuth 2.0 is the industry-standard protocol for authorisation. It provides authorisation flows for web and desktop applications, mobile phones and even living room devices.
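As one example of such a flow, a backend service can obtain its own token with the OAuth 2.0 client credentials grant: it sends a form-encoded POST request to the identity provider's token endpoint and authenticates with HTTP Basic. The sketch below only builds the request and does not send it. The URL, client ID, secret and scope are hypothetical, and it assumes plain ASCII credentials that need no extra escaping.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(token_url: str, client_id: str, client_secret: str,
                        scope: str) -> urllib.request.Request:
    # Form-encoded body for the client credentials grant.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": scope}
    ).encode("ascii")
    # The client authenticates itself with HTTP Basic.
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Hypothetical endpoint and client; nothing is sent over the network here.
request = build_token_request(
    "https://idp.example.org/oauth/token", "export-service", "s3cret", "data.read"
)
```

Passing the request to `urllib.request.urlopen` would perform the call; the identity provider would then answer with a JSON document containing the access token.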
User management for real world data
Token-based authentication and user management are crucial in complex IT systems with a variety of clients (servers and apps). Platforms with a variety of clients (and different groups of users) are required when working with real world data:
- wearable devices need to be connected to the end point collecting the data;
- raw collected data needs to be stored on one server;
- aggregated collected data may need to be stored on another server;
- collected data needs to be exported to another server or individual computer;
- patients and doctors may have different access to the data;
- even more so, different groups of doctors may have different levels of access to the data;
- and many more...
Because tokens give all these clients a uniform way to communicate with the different services, token-based user management is especially beneficial for platforms with such a wide range of clients.
The decoupling of user authentication from user queries and requests also helps the performance of the platform. As soon as the token is issued, the identity provider is no longer needed in the process. A bonus of this physical decoupling (user authentication and user queries are handled on two different servers) is extra security. When one server stores real names and their corresponding identifiers, and another server contains the data linked only to the identifiers, both servers have to be hacked in order to obtain patients’ personal data. Ideally, the identity provider resides on an extra-secured dedicated node to limit the attack surface.
Would you like to know how we implemented user management in the RADAR-CNS project, for which we collect and process wearable sensor data? Leave a reply below or:
- Read about our effort in RADAR-CNS
- Read about the wearable collection platform we are working on
- Read about RADAR-base solution