Organize Data for Epidemiological Studies

What is HTML DataBook?

An HTML DataBook is a set of interlinked web pages that report the following information for an epidemiological study:

  1. Which datasets are collected? How many subjects are in each dataset?
  2. Which variables are in each dataset?
  3. What data is available for each individual subject?
  4. For each dataset, what is the distribution of each variable (mean, standard deviation, and percentiles, or frequencies)?
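
As an illustration of item 4, the per-variable summary could be sketched as follows. This is a minimal sketch and not Empower DataBook's actual implementation: the function name `summarize_variable` and the rule "numeric if every non-missing value parses as a number, otherwise categorical" are assumptions made for the example.

```python
import statistics
from collections import Counter


def summarize_variable(values):
    """Summarize one variable: mean/sd/quartiles if numeric, else frequencies.

    Hypothetical helper for illustration only; empty strings and None are
    treated as missing values.
    """
    present = [v for v in values if v not in ("", None)]
    try:
        numbers = [float(v) for v in present]
    except (TypeError, ValueError):
        # At least one value is not a number: report a frequency table.
        return {
            "type": "categorical",
            "n": len(present),
            "frequency": dict(Counter(present)),
        }
    if len(numbers) < 2:
        # Too few values for a standard deviation or quartiles.
        return {"type": "numeric", "n": len(numbers)}
    q1, median, q3 = statistics.quantiles(numbers, n=4)
    return {
        "type": "numeric",
        "n": len(numbers),
        "mean": statistics.mean(numbers),
        "sd": statistics.stdev(numbers),  # sample standard deviation
        "p25": q1,
        "p50": median,
        "p75": q3,
    }


if __name__ == "__main__":
    print(summarize_variable(["61", "58", "", "70", "66"]))
    print(summarize_variable(["M", "F", "F", ""]))
```

Values are passed in as strings because that is how they arrive from delimited text files; a real report would apply the same summary to every variable of every dataset.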

Click here for an example HTML DataBook, and click here for an explanation of each table in the sample HTML DataBook.

Why use HTML DataBook?

A large research study usually collects data from multiple sources, such as questionnaires, physical exams, medical record reviews, laboratory tests, and so on.
Data from each source is usually saved in a separate dataset. Empower DataBook provides an overview of the datasets and the contents of each one, which helps investigators:

  1. Quickly get familiar with the study contents, datasets and variables
  2. Generate hypotheses and data analysis plans
  3. Share information and collaborate with others

Try Empower DataBook
Organize, Discover, Collaborate
Download Now >

How to use Empower DataBook

Simply enter the Project Name and the Data Directory, then click the “Create Databook” button.
Below is a screenshot of the Empower DataBook input window:


  1. Organize all datasets of a study into one directory
    • Your data files can be SAS datasets or text files (tab-, comma-, or space-delimited).
    • SAS datasets can be located on a local PC or on a Unix server.
    • Tab-, comma-, or space-delimited text files must be saved on a local PC.
  2. Optional input information
    • Subject ID variable: if every data file shares a common variable that identifies each subject of the study, enter its name. With the subject ID variable specified, Empower DataBook can report the number of subjects for each test item (tables 2 and 3) and the test items for each subject (table 4).
    • Project title: optional. It is a short description of the project.
    • Data file descriptions: optional. If the data files are located on a local PC, Empower DataBook automatically searches the data directory and lists all data files. If the data files are located on a remote server (SAS datasets only), Empower DataBook writes SAS code that automatically detects all datasets in the directory.
    • Data document file: optional. If you have a questionnaire, record sheet, or other document for a data file, save it as a .pdf file with the same name as the dataset (e.g., ques1.pdf is the document file for ques1.sas7bdat) and put it in the data directory. These .pdf files are automatically linked to the report pages.
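
The directory conventions above (one directory per study, a same-named .pdf per dataset, and a shared subject ID variable) can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's own code: the function names, the `DATA_EXTENSIONS` list, and the restriction of the subject count to comma-delimited files with a header row are all hypothetical choices made for the example.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Assumed extensions for illustration; the tool's actual list is not documented here.
DATA_EXTENSIONS = {".sas7bdat", ".csv", ".txt"}


def find_datasets(data_dir):
    """Map each dataset name to its data file and, if present, a same-named PDF."""
    datasets = {}
    for path in sorted(Path(data_dir).iterdir()):
        if path.suffix.lower() in DATA_EXTENSIONS:
            pdf = path.with_suffix(".pdf")  # ques1.sas7bdat -> ques1.pdf
            datasets[path.stem] = {
                "file": path,
                "document": pdf if pdf.exists() else None,
            }
    return datasets


def subjects_by_dataset(csv_files, id_variable):
    """Return {dataset name: set of subject IDs} for comma-delimited files."""
    result = {}
    for path in csv_files:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)  # first row is taken as the header
            result[Path(path).stem] = {
                row[id_variable] for row in reader if row.get(id_variable)
            }
    return result


def datasets_by_subject(subject_sets):
    """Invert the mapping: list the datasets in which each subject appears."""
    index = defaultdict(list)
    for dataset, ids in sorted(subject_sets.items()):
        for subject in ids:
            index[subject].append(dataset)
    return dict(index)
```

For example, `find_datasets("study_data")` followed by `subjects_by_dataset(...)` on the comma-delimited files gives the number of subjects per dataset, and `datasets_by_subject(...)` gives the per-subject view; the directory name `study_data` is a placeholder.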


Run SAS on Unix: If your data files are SAS datasets located on a Unix server, Empower DataBook first asks you to set up the SFTP/PuTTY connection parameters. It then automatically uploads the SAS code it created to the Unix server and executes it. When SAS finishes executing on the server, you can download the HTML DataBook files to your local PC for review.
