Welcome to the NSRR Cross Dataset Query Interface

The NSRR Cross Dataset Query Interface is designed for direct use by clinical researchers. It supports data exploration (formulating and clarifying potential hypotheses and determining whether data are available to support them) as well as cohort identification across multiple datasets. It has five main components:
  1. Query Builder, with terminology support and visual controls such as slider bars and checkboxes to construct query criteria;
  2. Query Manager, which stores and labels queries for reuse;
  3. Graphical Exploration, which shows the graphical distribution of core terms across multiple datasets;
  4. Case-Control Exploration, which supports both within-dataset and cross-dataset frequency matching for case-control exploration;
  5. Case-Control Manager, which saves and labels case-control explorations for reuse.


Clicking the sub menu QUERY BUILDER under the menu QUERY will lead you to the Query Builder page[Fig01].

  • In the "Area for data sets selection" (blue box 1), you can select datasets of interest such as SHHS (Sleep Heart Health Study) and CHAT (Childhood Adenotonsillectomy Study).
  • In the "Area for looking up core terms" (blue box 2), you can look up a specific core term in two ways. The first way is "browse" (red circle), where you browse core terms by category. For example, clicking the first category "01 - Demographics" (Arrow A) opens a sub menu listing all the core terms (e.g., "Age", "Ethnicity", "Gender", "Race") or sub categories under this category. The second way is "search" (red circle in Fig02), where you type a core term such as "gender" in the text box. A list of candidate terms that partially match the search text is then displayed automatically.
  • Clicking an appropriate candidate term (Arrow B or Arrow C in Fig01, or Arrow D in Fig02) adds a corresponding query widget to the "Area for query composition" (blue box 3), where you can specify the query based on the type of the term and the dataset variables mapped to it. If the term is categorical, checkboxes are provided for you to select from. If the term is numeric, a slider bar and two text boxes specifying the range are provided, so you can either slide the bar or enter a range in the text boxes.
  • Clicking the "Query" button in the bottom-left corner of the "Area for query composition" (blue box 3) returns the number of unique subjects satisfying the composed query criteria. For example, both Fig01 and Fig02 show a sample query for the number of unique female subjects in SHHS and CHAT with age between 10 and 50.
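The way a composed query narrows down to a count of unique subjects can be sketched in a few lines of Python. This is only an illustration of the idea, not how the interface is implemented: the `Subject` record, the `count_unique_subjects` function, the sample rows, and the choice to treat the age range as inclusive at both ends are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical in-memory records; the real interface runs queries on the server.
@dataclass(frozen=True)
class Subject:
    subject_id: str
    dataset: str
    gender: str
    age: float

def count_unique_subjects(subjects, datasets, genders, age_min, age_max):
    """Count distinct (dataset, subject) pairs that satisfy every criterion."""
    matched = {
        (s.dataset, s.subject_id)
        for s in subjects
        if s.dataset in datasets              # dataset selection (blue box 1)
        and s.gender in genders               # categorical term: ticked checkboxes
        and age_min <= s.age <= age_max       # numeric term: range (assumed inclusive)
    }
    return len(matched)

# Made-up sample rows, including one duplicate row for the same subject.
subjects = [
    Subject("001", "SHHS", "Female", 45),
    Subject("001", "SHHS", "Female", 45),
    Subject("002", "SHHS", "Male", 40),
    Subject("003", "CHAT", "Female", 9),
    Subject("004", "CHAT", "Female", 10),
    Subject("005", "SHHS", "Female", 63),
]

unique_count = count_unique_subjects(subjects, {"SHHS", "CHAT"}, {"Female"}, 10, 50)
print(unique_count)
```

Collecting matches in a set is what makes the count "unique": a subject that appears in several rows is counted once.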
Fig01: Query Builder (browse)

Fig02: Query Builder (search)

While composing a query, or once you are done, you can scroll down and save the constructed query[Fig03]. Be sure to sign up or sign in before saving a query. The Name and Description of the query are generated automatically for your convenience, and you can customize them as you like.

Fig03: Save Query

Clicking the "Save Query" button in Fig03 will lead you to the Query Manager page[Fig04], which allows you to keep track of saved queries and retrieve record information on the queries. You can also click the sub menu QUERY MANAGER under the menu QUERY or click the tab Query Manager to get to the Query Manager page[Fig04]. Clicking the name of a saved query (blue box) will link to the Query Builder page with query widgets automatically loaded for the saved query, where you can review and update query criteria.

Fig04: Query Manager

There are two ways to explore the graphical distribution of data across multiple datasets: by a single core term, or by any two core terms.

The first way is to view graphical distribution for a single core term across all the dataset variables mapping to it. Box plots are generated for numeric terms and bar plots for categorical terms. For example[Fig05], in the Query Builder page, clicking the core term "Body mass index" under the category "02 - Anthropometry" (blue arrow) will generate a query widget, where you can click the button "View Box Plots" (red circle) to view the box plots[Fig06] generated for each dataset variable mapping to the core term "Body mass index".

Fig05: Body mass index (browse)

As shown in Fig06, there are four box plots corresponding to four variables in datasets SHHS and CHAT mapping to the core term "Body mass index". The summary statistics of these variables are also reported in the table below the box plots. Hovering over a box plot will show the summary statistics for the corresponding variable.
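For readers who want to see what lies behind a box plot, the following sketch computes a per-variable summary with Python's standard library. The exact statistics that the interface reports are not listed here, so the choice of a five-number summary plus mean is an assumption, as are the variable labels and the sample values.

```python
import statistics

def summarize(values):
    """Five-number summary plus mean, the usual ingredients of a box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "mean": statistics.fmean(values),
    }

# Hypothetical variables mapped to the core term "Body mass index" (made-up values).
bmi_by_variable = {
    "SHHS visit 1 BMI": [22.0, 24.0, 26.0, 28.0, 35.0],
    "CHAT baseline BMI": [15.0, 16.5, 18.0, 19.5, 21.0],
}

summaries = {name: summarize(values) for name, values in bmi_by_variable.items()}
for name, summary in summaries.items():
    print(name, summary)
```

One summary is produced per dataset variable, mirroring the one-box-plot-per-variable layout described above.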

Fig06: Body mass index (box plots)

The second way is to view the graphical distribution for any two core terms. Clicking the sub menu GRAPHICAL EXPLORATION under the menu QUERY, or clicking the tab Graphical Exploration, will lead you to the Graphical Exploration page[Fig07]. Choose one core term for the X axis and another for the Y axis (e.g., Race for the X axis and Systolic blood pressure for the Y axis, in blue boxes). Box plots are generated if a numeric core term is chosen for the Y axis, and bar plots if a categorical core term is chosen for the Y axis.

Fig07: Graphical Exploration (X and Y axis selection)

Clicking the button "Graph Distribution" (red circle in Fig07) will generate multiple graphs[Fig08]. Each graph in Fig08 is rendered for a variable in a dataset mapping to Y axis against a variable mapping to X axis in the same dataset. For example, the first graph in Fig08 shows the box plots for the baseline-visit variable "Average Systolic BP" in the dataset SHHS against the baseline-visit variable "Race" in SHHS. Here each box plot is rendered for the "Average Systolic BP" data under a specific option of "Race" (e.g., White, Black, or Other).
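The grouping behind each graph can be sketched as follows: within one dataset, the Y-axis values are split by the option of the X-axis variable, and one box plot is drawn per group. The row layout, the variable names `race` and `avg_systolic_bp`, the sample values, and the decision to skip rows with missing values are all assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical rows from a single dataset (values are made up).
shhs_rows = [
    {"race": "White", "avg_systolic_bp": 120.0},
    {"race": "Black", "avg_systolic_bp": 131.0},
    {"race": "White", "avg_systolic_bp": 128.0},
    {"race": "Other", "avg_systolic_bp": None},   # missing Y value, skipped
    {"race": "Other", "avg_systolic_bp": 118.0},
]

def group_y_by_x(rows, x_var, y_var):
    """Collect the Y values under each option of the X variable."""
    groups = defaultdict(list)
    for row in rows:
        x, y = row.get(x_var), row.get(y_var)
        if x is None or y is None:
            continue  # assumption: incomplete rows are left out of the plot
        groups[x].append(y)
    return dict(groups)

groups = group_y_by_x(shhs_rows, "race", "avg_systolic_bp")
for option, values in groups.items():
    print(option, values)
```

Each list in `groups` holds the data that a single box plot would summarize.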

Fig08: Graphical Exploration (resulting graphs)

Clicking the sub menu CASE-CONTROL EXPLORATION under the menu QUERY or clicking the tab Case-Control Exploration will lead you to the Case-Control Exploration page[Fig09]. The workflow to explore case-control matching consists of five steps:

  1. Set Base Query Terms. This step allows users to specify query criteria for base terms (e.g., age, gender) to limit the population group for the potential case-control study.
  2. Set Case Condition. This step asks users to specify a query condition for cases. A popup window[Fig10] for setting up the case condition is displayed once you click the "Set Case Condition" button.
  3. Set Control Condition. This step allows users to specify the condition for controls, which is what distinguishes controls from cases.
  4. Set Match Terms. This step allows users to set up match terms (e.g., age, race) for the potential case-control study.
  5. Set Outcome Terms. This step enables users to set outcome terms in order to explore potential association between outcome terms and the condition distinguishing cases and controls.

Fig09: Case-Control Exploration (workflow)
Fig10: Set Case Condition (popup window)

Fig11 shows an example with all conditions in the above steps specified. For each numeric match term or numeric outcome term, users can configure the bin size[Fig12] to control how the results are rendered. You can then choose the exploration type[Fig09]: "Cross Dataset" or "Within Dataset." Clicking the "Matching" button (in Fig09) renders the result as a multi-column table[Fig12]. In addition, the matching result can be downloaded and saved in Comma-Separated Values (CSV) format.
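To give a feel for how bin size and match terms interact, here is a generic sketch of frequency matching. The matching algorithm used by the interface is not described in this guide, so treat this purely as an illustration: the `stratum` and `frequency_match` functions, the one-control-per-case ratio, and the sample subjects are all assumptions.

```python
from collections import Counter

def stratum(subject, bin_size):
    """Stratum key: lower bound of the age bin, plus race."""
    age_bin = int(subject["age"] // bin_size) * bin_size
    return (age_bin, subject["race"])

def frequency_match(cases, controls, bin_size=10):
    """Per stratum, report cases, available controls, and 1:1 matched controls."""
    case_counts = Counter(stratum(s, bin_size) for s in cases)
    control_counts = Counter(stratum(s, bin_size) for s in controls)
    return {
        key: {
            "cases": n_cases,
            "controls_available": control_counts[key],
            "controls_matched": min(n_cases, control_counts[key]),
        }
        for key, n_cases in sorted(case_counts.items())
    }

# Made-up subjects already filtered by the base query and case/control conditions.
cases = [
    {"age": 42, "race": "White"},
    {"age": 47, "race": "White"},
    {"age": 55, "race": "Black"},
]
controls = [
    {"age": 41, "race": "White"},
    {"age": 58, "race": "Black"},
    {"age": 59, "race": "Black"},
    {"age": 33, "race": "White"},
]

result = frequency_match(cases, controls, bin_size=10)
for key, row in result.items():
    print(key, row)
```

A smaller bin size produces more, narrower strata, which makes the match stricter but leaves fewer controls available in each stratum.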

Fig11: Case-Control Exploration (all conditions specified)
Fig12: Case-Control Exploration (matching result)

Clicking the "Save Exploration" button in Fig10 will lead you to the Case-Control Manager page[Fig13], which allows you to keep track of saved case-control explorations. You can also click the sub menu CASE-CONTROL MANAGER under the menu QUERY or click the tab Case-Control Manager to get to the Case-Control Manager page[Fig13]. Clicking the name of a saved case-control exploration (blue box) will link to the Case-Control Exploration page with the saved exploration automatically loaded, where you can review and update the exploration.

Fig13: Case-Control Manager

Ready to start building your own query or exploring graphs? Go to "Build Query" or "Graphical Exploration".