otc
Sounds about right.
Our SAS server forces a structure on users.
Client
-Raw Data (stuff in here gets auto-zipped if not accessed in a certain amount of time)
-SAS Data (only SAS datasets; default is read/write access for everyone)
---subfolders here for sub-projects
---different sources
---personal interim data/etc
-SAS Programs (a subfolder is created for every user who logs in to that client; other users get read-only access)
---My folder
----Programs sit here
----Optional subfolders (if a project gets long and I want to archive stuff away/keep a specific version/etc.)
----Output (where I can freely dump temporary PDF/Excel-type outputs for viewing/emailing without caring about overwriting anything)
-Stata Data
-Stata Programs (similar structure for Stata, but I don't use it so I don't really pay attention)
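For anyone who wants to mock up the same layout locally, here is a rough sketch in Python. The folder names come from the outline above, but the client name, the user name, the function name, and the per-user folder under Stata Programs are my own assumptions for illustration; this is not how our server actually builds it.

```python
from pathlib import Path
import tempfile

# Folder names mirror the outline above. The "{user}" placeholder and the
# per-user folder under "Stata Programs" are assumptions for illustration.
LAYOUT = [
    "Raw Data",
    "SAS Data",
    "SAS Programs/{user}",
    "SAS Programs/{user}/Output",
    "Stata Data",
    "Stata Programs/{user}",
]


def make_client_tree(root: Path, client: str, user: str) -> Path:
    """Create the per-client folder skeleton and return the client folder."""
    client_dir = root / client
    for template in LAYOUT:
        # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
        (client_dir / template.format(user=user)).mkdir(parents=True, exist_ok=True)
    return client_dir


if __name__ == "__main__":
    # Build the tree in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        client_dir = make_client_tree(Path(tmp), "ExampleClient", "otc")
        for path in sorted(p for p in client_dir.rglob("*") if p.is_dir()):
            print(path.relative_to(client_dir))
```

It only creates folders. The permissions and the auto-zipping are handled by the server and are not modeled here.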
Beyond that, organization is left up to the user...I typically create my output folder to dump stuff in. Raw Data doesn't always hold the raw data...we have a staff dedicated to converting data, so sometimes they have the originals (but then usually the SAS datasets can be considered exact transcriptions of the raw data). If data cleaning leads to a new dataset (rather than just a block of code that does the cleaning when reading it in), that will usually end up in a different folder. I often end up with my own folder in the SAS data directory for stuff I am playing with where I might want to write out a permanent dataset.
The system keeps people well enough organized. It's not perfect (organization within a user's folder might be terrible, and even if I create my "own" data folder, people might end up saving their datasets in there), but when you are talking about people who aren't formally trained programmers, and when there might be 40 different people who have logged in to a client and created a program, it at least means you can tell who did what and have a decent chance of finding someone else's code if you need it.
Unfortunately, it is not conducive to version control, which I would prefer. By default, the server cleans up files it doesn't recognize (often zipping unused files that aren't SAS programs to save space) and fiddles with permissions, which makes dropping something like a Mercurial repository on the server impossible.
Edit: and I guess I should say, I usually use SAS through Enterprise Guide (although I don't use any of its automated features...just use it to edit code). EG uses project files, so instead of storing a lot of random programs in the directory, I store a project file that contains the code. I might have more than one of these for unrelated tasks that don't share data...and I might make a one-off version when a report goes out that contains only the code needed to produce the numbers in the report (a sort of poor man's version control).
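The "freeze a copy when a report goes out" step could be scripted roughly like this. It is only a sketch: the `Archive` folder name, the function name, the label, and the date-stamped file naming are all made up for illustration, and it copies the project file as-is, so trimming it down to just the report code would still be a manual step.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def snapshot_project(project: Path, label: str, when: Optional[date] = None) -> Path:
    """Copy a project file into an Archive subfolder under a dated name.

    The "Archive" folder and the naming scheme are hypothetical choices.
    """
    when = when or date.today()
    archive = project.parent / "Archive"
    archive.mkdir(exist_ok=True)
    target = archive / f"{project.stem}_{label}_{when:%Y%m%d}{project.suffix}"
    # Refuse to overwrite: a snapshot is supposed to stay frozen.
    if target.exists():
        raise FileExistsError(f"snapshot already exists: {target}")
    shutil.copy2(project, target)  # copy2 also keeps the file's timestamps
    return target
```

Calling something like `snapshot_project(Path("analysis.egp"), "q1report")` (file name hypothetical) would leave the working project untouched and put a dated copy next to it.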