1. Cohort & dataset builder

If you need a concept but don’t know how to query it, use the cohort builder to create sample criteria, then click ‘preview code’ to see the generated Jupyter code, including the variable names and ranges.

2. Write an optimized query

Avoid “SELECT *” (see below). It returns every column of a table, so the query uses a lot of computing power and takes a long time to load. That makes it both slow and expensive.

...

Finally, add filters whenever you can to reduce the amount of data you use/load.

See examples below.
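A minimal sketch of the two rules above (name the columns, add filters). The helper `build_query` and the date filter are made up for illustration. `condition_occurrence` and its columns are standard OMOP names, but check them against your own dataset. The fallback dataset string is only a placeholder so the code also runs outside the workbench.

```python
import os

# WORKSPACE_CDR holds the current dataset name in the workbench; the fallback
# string is a placeholder (assumption) so this sketch runs anywhere.
dataset = os.getenv("WORKSPACE_CDR", "my_project.my_dataset")


def build_query(dataset, table, columns, filters):
    """Return a SQL string that names its columns and applies every filter."""
    if not columns:
        raise ValueError("list the columns you need instead of using SELECT *")
    select_clause = ",\n    ".join(columns)
    sql = f"SELECT\n    {select_clause}\nFROM `{dataset}.{table}`"
    if filters:
        sql += "\nWHERE " + "\n  AND ".join(filters)
    return sql


query = build_query(
    dataset,
    table="condition_occurrence",
    columns=["person_id", "condition_concept_id", "condition_start_date"],
    filters=["condition_start_date >= '2015-01-01'"],  # illustrative filter
)
print(query)
```

The printed string can then be passed to whatever query call your generated notebook code already uses.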

...

3. Save data from super long query

Data are deleted after about a week if you don’t save them. Save your data to the workspace bucket and load them from that bucket later.
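A rough sketch of copying a saved file to the bucket, assuming the workbench exposes the bucket path in the `WORKSPACE_BUCKET` environment variable and that `gsutil` is on the path. The file name, the `data` subfolder and the helper `bucket_copy_command` are hypothetical. The snippets do the same thing with less typing.

```python
import os
import shutil
import subprocess


def bucket_copy_command(local_path, bucket, subdir="data"):
    """Build the gsutil command that copies one local file into the bucket."""
    return ["gsutil", "cp", local_path, f"{bucket}/{subdir}/"]


local_file = "long_query_result.csv"    # hypothetical file name
bucket = os.getenv("WORKSPACE_BUCKET")  # assumed to be set by the workbench

if bucket and shutil.which("gsutil") and os.path.exists(local_file):
    # Only copy when the bucket, the tool and the file are all available.
    subprocess.run(bucket_copy_command(local_file, bucket), check=True)
else:
    # Outside the workbench, just show what would be run.
    print("Would run:", " ".join(bucket_copy_command(local_file, "gs://<your-bucket>")))
```

To load the data again later, swap the source and destination in the same command.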

...

If you use snippets, always run the corresponding “Setup” snippet first. Then select the snippet “List buckets”, followed by “copy file from workspace bucket”.

...

5. Restart & run all

6. Save your most recent notebook version

The snippet for saving your notebook version is only available in Python, not in R.

I didn’t quite follow this part.

Notes for my study:

To account for dataset changes/updates, run this first:

import os

dataset_name = os.getenv('WORKSPACE_CDR')

dataset_name

To get the date difference: OSA x HTN: condition_start_date
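A small sketch of the date difference, assuming the note means the number of days between each person's first OSA and first HTN start date. The dates and person IDs below are made up for illustration, and `days_between` is a hypothetical helper. Real values would come from the query results.

```python
from datetime import date

# Made-up first-diagnosis dates per person_id, for illustration only.
first_osa = {1: date(2018, 3, 1), 2: date(2020, 7, 15)}
first_htn = {1: date(2019, 3, 1), 2: date(2020, 1, 15)}


def days_between(start_dates, end_dates):
    """Days from start to end for every person present in both mappings."""
    return {
        pid: (end_dates[pid] - start_dates[pid]).days
        for pid in start_dates
        if pid in end_dates
    }


# Positive: HTN started after OSA. Negative: HTN started before OSA.
osa_to_htn_days = days_between(first_osa, first_htn)
print(osa_to_htn_days)
```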

...
