User Development Papers A-Z

A
Session SAS0727-2017:
Accessibility and SAS® University Edition: Tips for Students and Professors
Accessibility has become a hot topic on campus due to a flurry of recent investigations by the U.S. Department of Justice and the U.S. Department of Education into discrimination against students with disabilities. This paper provides an update on the latest improvements in SAS® University Edition that are specifically targeted to enable students with disabilities to excel in the classroom and beyond. This paper covers the entire SAS University Edition user experience including installation, documentation, training, support, using SAS® Studio, and the new accessibility features in the fourth maintenance release of SAS® 9.4.
Read the paper (PDF)
Ed Summers, SAS
Amy Peters, SAS
Session 0930-2017:
Advanced Programming Techniques with PROC SQL
The SQL procedure has a number of powerful and elegant language features for SQL users. This hands-on workshop emphasizes highly valuable and widely usable advanced programming techniques that will help users of Base SAS® harness the power of PROC SQL. Topics include using PROC SQL to identify FIRST.row, LAST.row, and Between.rows in BY-group processing; constructing and searching the contents of a value-list macro variable for a specific value; data validation operations using various integrity constraints; data summary operations to process down rows and across columns; and using the MSGLEVEL= system option and _METHOD SQL option to capture vital processing information and the algorithm selected and used by the optimizer when processing a query.
Read the paper (PDF) | Download the data file (ZIP)
Kirk Paul Lafler, Software Intelligence Corporation
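The workshop itself works in PROC SQL. As a language-neutral illustration of what identifying FIRST, LAST, and between rows in a BY group means, here is a minimal Python sketch. The function name `tag_by_group`, the `position` column, and the sample data are hypothetical and do not come from the paper; the sketch also labels single-row groups `ONLY`, whereas SAS would flag such a row as both first and last.

```python
from itertools import groupby
from operator import itemgetter


def tag_by_group(rows, key):
    """Label each row FIRST, LAST, BETWEEN, or ONLY within its BY group.

    `rows` must already be sorted by `key`, mirroring BY-group processing.
    """
    tagged = []
    for _, group in groupby(rows, key=itemgetter(key)):
        members = list(group)
        last_index = len(members) - 1
        for i, row in enumerate(members):
            if last_index == 0:
                label = "ONLY"      # a one-row group is both first and last
            elif i == 0:
                label = "FIRST"
            elif i == last_index:
                label = "LAST"
            else:
                label = "BETWEEN"
            tagged.append({**row, "position": label})
    return tagged


# Hypothetical sample data, already sorted by the BY variable "customer".
rows = [
    {"customer": "A", "amount": 10},
    {"customer": "A", "amount": 20},
    {"customer": "A", "amount": 30},
    {"customer": "B", "amount": 5},
    {"customer": "C", "amount": 7},
    {"customer": "C", "amount": 9},
]

tagged = tag_by_group(rows, "customer")
for row in tagged:
    print(row["customer"], row["amount"], row["position"])
```

Sorting before grouping matters here for the same reason it matters in BY-group processing: group boundaries are detected only between adjacent rows.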
Session 1260-2017:
Analyzing the Effect of Weather on Uber Ridership
Uber has changed the face of taxi ridership, making it more convenient and comfortable for riders. But there are times when customers are dissatisfied because of a shortage of Uber vehicles, which ultimately leads to Uber surge pricing. It's a very difficult task to forecast the number of riders at different locations in a city at different points in time. This gets more complicated with changes in weather. In this paper, we attempt to estimate the number of trips per borough on a daily basis in New York City. We add an exogenous factor, weather, to this analysis to see how it impacts the changes in the number of trips. We fetched six months' worth of data (approximately 9.7 million records) of Uber rides in New York City ranging from January 2015 to June 2015 from GitHub. We gathered weather data (about 3.5 million records) for New York City for the same period from the National Climatic Data Center. We analyzed Uber data and weather data together to estimate the change in the number of trips per borough due to changing weather conditions. We built a model to predict the number of trips per day for a one-week-ahead forecast for each borough of New York City. As part of a further analysis, we got the number of trips on a particular day for each borough. Using time series analysis, we forecast the number of trips that might be required in the near future (probably one week).
Read the paper (PDF) | View the e-poster or slides (PDF)
Anusha Mamillapalli, Oklahoma State University
Singdha Gutha, Oklahoma State University
Session 0968-2017:
Association between Sunlight and Specific-Cause Mortality
Research frequently shows that exposure to sunlight contributes to non-melanoma skin cancer. But it also shows that sunlight might protect you against multiple sclerosis and breast, ovarian, prostate, and colon cancer. In my study, I explored whether mortality from skin cancer, myocardial infarction, atrial fibrillation, and stroke is associated with exposure to sunlight. I used SAS® 9.4 and RStudio to conduct the entire study. I collected mortality data including cause of death in Los Angeles from 2000 to 2003. In addition, I collected sunlight data for Los Angeles for the same period. There are three types of sunlight in my data: global sunlight, diffuse sunlight, and direct sunlight. Data was collected at three different times: morning, middle of day, and afternoon. I used two models, the Poisson time series regression model and a logistic regression model, to investigate the association. I considered a one-year and two-year lag of sunlight association with the types of diseases. I adjusted for age, sex, race, education, temperature, and day of week. Results show that stroke is statistically significantly associated with a one-year lag of sunlight (p<0.001). Previous epidemiological studies have found that sunlight exposure can ameliorate osteoporosis in stroke patients, and my study provides evidence of the protective effects of sunlight on stroke patients.
View the e-poster or slides (PDF)
Wei Xiong, University of Southern California
Session 0836-2017:
Automate Validation of CDISC SDTM with SAS®
There are many good validation tools for Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) such as Pinnacle 21. However, the power and customizability of SAS® provide an effective tool for validating SDTM data sets used in FDA submissions for clinical trials. This paper presents three distinct methods of using SAS to validate the transformation from Electronic Data Capture (EDC) data into CDISC SDTM format. These are: duplicate programming, an independent SAS program that transforms the EDC data, checked with PROC COMPARE; rules checker, a SAS program to verify specific SDTM or regulatory rules applied to SDTM SAS data sets; and transformation validation, a SAS macro used to compare EDC data and SDTM using PROC FREQ to identify outliers. The three examples illustrate the diverse approaches to applying SAS programs to catch errors in data standard compliance or identify inconsistencies that would otherwise be missed by other general purpose utilities. The stakes are high when preparing for an FDA submission. Catching errors in SDTM during validation prior to a submission can mean the difference between success and failure for a drug or medical device.
Read the paper (PDF)
Sy Truong, Pharmacyclics
B
Session 0175-2017:
Best-Practice Programming Techniques Using SAS® Software
It's essential that SAS® users enhance their skills to implement best-practice programming techniques when using Base SAS® software. This presentation illustrates core concepts with examples to ensure that code is readable, clearly written, understandable, structured, portable, and maintainable. Attendees learn how to apply good programming techniques including implementing naming conventions for data sets, variables, programs, and libraries; code appearance and structure using modular design, logic scenarios, controlled loops, subroutines and embedded control flow; code compatibility and portability across applications and operating platforms; developing readable code and program documentation; applying statements, options, and definitions to achieve the greatest advantage in the program environment; and implementing program generality into code to enable its continued operation with little or no modifications.
Read the paper (PDF)
Kirk Paul Lafler, Software Intelligence Corporation
Session 0929-2017:
Building a Member-Centric World from a Transactional Data Galaxy
Health insurers have terabytes of transactional data. However, transactional data does not tell a member-level story. Humana Inc. is often faced with requirements for tagging (identifying) members with various clinical conditions such as diabetes, depression, hypertension, hyperlipidemia, and various member-level utilization metrics. For example, Consumer Health Tags are built to identify the condition (for example, diabetes or hypertension) and to estimate the intensity of the disease using medical and pharmacy administrative claims data. This case study takes you on an analytics journey from the initial problem diagnosis to the analytics solution using SAS®.
Read the paper (PDF)
Brian Mitchell, Humana Inc.
Session 0865-2017:
Building an Analytics Culture at a 114-year-old Regulated Electric Utility
Coming off a recent smart grid implementation, OGE Energy Corp. was collecting more data than at any time in its history. This data held the potential to help the organization uncover new insights and chart new paths. Find out how OGE Energy is building a culture of data analytics by using SAS® tools, a distributed analytics model, and an analytics center of excellence.
Clayton Bellamy, OGE Energy Corp
C
Session 1369-2017:
Charting Your Path to Using the “New” SAS® ODS and SG Graphics Successfully
SAS® Output Delivery System (ODS) Graphics started appearing in SAS® 9.2. Collectively these new tools were referred to as 'ODS Graphics,' 'SG Graphics,' and 'Statistical Graphics.' When first starting to use these tools, the traditional SAS/GRAPH® software user might come upon some very significant challenges in learning the new way to do things. This is further complicated by the lack of simple demonstrations of capabilities. Most graphs in training materials and publications are rather complicated graphs that, while useful, are not good teaching examples for starting purposes. This paper contains many examples of very simple ways to get very simple things accomplished. Many different graphs are developed using only a few lines of code each, using data from the SASHELP data sets. The use of the SGPLOT, SGPANEL, and SGSCATTER procedures is shown. In addition, the paper addresses those situations in which the user must alternatively use a combination of the TEMPLATE and SGRENDER procedures to accomplish the task at hand. Most importantly, the use of the 'ODS Graphics Designer' as a teaching tool and a generator of sample graphs and code is covered. This tool makes use of the TEMPLATE and SGRENDER procedures, generating Graphics Template Language (GTL) code. Users become productive quickly. The emphasis in this paper is the simplicity of the learning process. Users will be able to take the generated code and run it immediately on their personal machines.
Read the paper (PDF) | View the e-poster or slides (PDF)
Roger Muller, Data-to-Events
Session 1098-2017:
Classroom Success with SAS® Grid Manager and SAS® Visual Analytics: Coping With Big Data
The Institute for Advanced Analytics struggled to provide student computing environments capable of analyzing increasingly large data sets for its Master of Science in Analytics program. For the fast-paced practicum, the centerpiece of the curriculum, waiting 24 hours for a FREQ procedure to complete was unacceptable. Practicum proposals from industry were pared down (or turned down) because the data sets were too large, depriving students of exciting and relevant learning experiences. By augmenting the practicum architecture with an 18-node computing cluster running SAS® Grid Manager, SAS® Visual Analytics, and the latest high-performance SAS® procedures, we were able to dramatically increase performance and begin accepting terabyte-scale practicum proposals from industry. In this paper, we discuss the benefits and lessons learned through adding these SAS products to our analytics degree program, including capability versus complexity tradeoffs, and the state of our current capabilities and limitations with this architecture.
Read the paper (PDF)
John Jernigan, Institute for Advanced Analytics at NC State University
Ken Gahagan, SAS
Cheryl Doninger, SAS
Session 1288-2017:
Creating a Departmental Standard SAS® Enterprise Guide® Template
This presentation describes an ongoing effort to standardize and simplify SAS® coding across a rapidly growing analytics team in the health care industry. The number of SAS analysts in Kaiser Permanente's Data and Information Management Enhancement (DIME) department has nearly doubled in the past two years, going from approximately 20 to 40 analysts. The level of experience and technical skill varies greatly within the department. Analysts are required to provide quick turn-around on a large volume of analytical requests in this dynamic and high-demand environment. An effort was initiated in 2016 to create a SAS® Enterprise Guide® Template to standardize and simplify SAS coding across the department. The SAS Enterprise Guide® template is designed to be a standard project file containing predefined code shells and examples that can be used as a basis for all new SAS Enterprise Guide® projects. The primary goals of the template are to: 1) Effectively onboard new analysts to department standards; 2) Increase the efficiency of SAS development; 3) Bring consistency to how SAS is used; and 4) Simplify the transitioning of SAS jobs to the department's Production Support team. This presentation focuses on the process in which the template was initiated, drafted, and socialized across a large and diverse team of SAS analysts. It also highlights plans for ongoing maintenance of and improvements to the original template.
Read the paper (PDF)
Amanda Pasch, Kaiser Permanente
Chris Koppenhafer, Kaiser Permanente
D
Session 0837-2017:
Data Science Rex: How Data Science Is Evolving (or Facing Extinction) across the Academic Landscape
The discipline of data science has seen an unprecedented evolution from primordial darkness to becoming the academic equivalent of an apex predator on university campuses across the country. But survival of the discipline is not guaranteed. This session explores the genetic makeup of programs that are likely to survive, the genetic makeup of those that are likely to become extinct, and the role that the business community plays in that evolutionary process.
Read the paper (PDF)
Jennifer Priestley, Kennesaw State University
E
Session 1068-2017:
Establishing an Agile, Self-Service Environment to Empower Agile Analytic Capabilities
Creating an environment that enables and empowers self-service and agile analytic capabilities requires a tremendous amount of collaboration and extensive agreements between IT and the business. Business and IT users are struggling to know what version of the data is valid, where they should get the data from, and how to combine and aggregate all the data sources to apply analytics and deliver results in a timely manner. All the while, IT is struggling to supply the business with more and more data that is becoming available through many different data sources such as the Internet, sensors, the Internet of Things, and others. In addition, once they start trying to join and aggregate all the different types of data, the manual coding can be very complicated and tedious, can demand extraneous resources and processing, and can add overhead to the system. If IT enables agile analytics in a data lab, it can alleviate many of these issues, increase productivity, and deliver an effective self-service environment for all users. This self-service environment using SAS® analytics in Teradata has decreased the time required to prepare the data and develop the statistical data model, and delivered results in minutes rather than days or even weeks. This session discusses how you can enable agile analytics in a data lab, how you can leverage SAS analytics in Teradata to increase performance, and how hundreds of organizations have adopted this concept to deliver self-service capabilities in a streamlined process.
Bob Matsey, Teradata
David Hare, SAS
F
Session 0863-2017:
Framework for Strategic Analysis in Higher Education
Higher education institutions have a plethora of analytical needs. However, the irregular and inconsistent practices in connecting those needs with appropriate analytical delivery systems have resulted in a patchwork that sometimes overlaps unnecessarily and sometimes exposes unaddressed gaps. The purpose of this paper is to examine a framework of components for addressing institutional analytical needs, while leveraging existing institutional strengths to maximize analytical goal attainment most effectively and efficiently. The core of this paper is a focused review of components for attaining greater analytical strength and goal attainment in the institution.
Read the paper (PDF)
Glenn James, Tennessee Tech University
Session 0810-2017:
Freedom to Inspire and Achieve Excellence
Innovation in teaching and assessment has become critical for many reasons. This is especially true in the fields of data science and big data analytics. Reasons range from the need to significantly improve the development of soft skills (as reported in an e-skills UK and SAS® joint report from November 2014), to the rapidly changing software standards of products used by students, to the rapidly increasing range of functionality and product set, to the need to develop lifelong learning skills to learn new software and functionality. And these are just a few of the reasons. In some educational institutions, it is easy to be extremely innovative. However, in many institutions and countries, there are numerous constraints on the levels of innovation that can be implemented. This presentation captures the author's developing pedagogic practice at the University of Derby. He suggests fundamental changes to the classic approaches to teaching and assessing data science and big data analytics. These changes have resulted in significant improvements in student engagement, student achievement, and students' soft skills. Improvements are illustrated by innovations in teaching SAS to first-year students and teaching IBM Bluemix and Watson Analytics to final-year students. Students have successfully developed both technical and soft skills and experienced excellent levels of achievement.
Read the paper (PDF)
Richard Self, University of Derby
Session 1385-2017:
From Researcher to Programmer: Five SAS® Tips I Wished I Knew Then
Having crossed the spectrum from an epidemiologist and researcher (where ad hoc is a way of life and where research is the main focus) to a SAS® programmer (writing reusable code for automation and batch jobs, which require no manual interventions), I have learned a few things that I wish I had known as a researcher. These things would not only have helped me to be a better SAS programmer, but they also would have saved me time and effort as a researcher by enabling me to have well-organized, accurate code (that I didn't accidentally remove) and code that would work when I ran it again on another date. This poster presents five SAS tips that are common practice among SAS programmers. I provide researchers who use SAS with tips that are handy and useful, and I provide code (where applicable) that they can try out at home. Using the tips provided will make any SAS programmer smile when they are presented with your code (not guaranteed, but your results should not vary by using these tips).
View the e-poster or slides (PDF)
Crystal Carel, Baylor Scott & White Health
H
Session 1512-2017:
Hot Topics for Analytics in Higher Education
This panel discusses a wide range of topics related to analytics in higher education. Panelists are from diverse institutions and represent academic research, information technology, and institutional research. Challenges related to data acquisition and quality, system support, and meeting customer needs are covered. Topics such as effective dashboards and reporting, big data, predictive analytics, and more are on the agenda.
Stephanie Thompson, Datamum
Glenn James, Tennessee Tech University
Robert Jackson, University of Memphis
Sean Mulvenon, University of Arkansas
Carlos Piemonti, University of Central Florida
Richard Dirmyer, Rochester Institute of Technology
L
Session 1325-2017:
Learn SAS® Programming Features to Step Up toward Team Management
Managing your career future involves learning outside the box at all stages. The next step is not always on the path we planned as opportunities develop and must be taken when we are ready. Prepare with this paper, which explains important features of Base SAS® that support teams. In this presentation, you learn about the following: concatenating team shared folders with personal development areas; creating consistent code; guidelines for a team (not standards); knowing where the documentation will provide the basics; thinking of those who follow (a different interface); creating code for use by others; and how code can learn about the SAS environment.
Read the paper (PDF)
Peter Crawford, Crawford Software Consultancy Limited
M
Session 1432-2017:
Make a University Partnership Your Secret Weapon for Finding Data Science Talent
In this panel session, professors from three geographically diverse universities explain what makes for an effective partnership with private sector companies. Specific examples are discussed from health care, insurance, financial services, and retail. The panelists discuss what works, what doesn't, and what both parties need to be prepared to bring to the table for a long-term, mutually beneficial partnership.
Jennifer Priestley, Kennesaw State University
Session 1447-2017:
Making SAS® Education Relevant to the Future Workforce
SAS® education is a mainstay across disciplines and educational levels in the United States. Along with other courses that are relevant to the jobs students want, independent SAS courses or SAS education integrated into additional courses can help a student be more interesting to a potential employer. The multitude of SAS offerings (SAS® University Edition, Base SAS®, SAS® Enterprise Guide®, SAS® Studio, and the SAS® OnDemand offerings) provide the tools for education, but reaching students where they are is the greatest key for making the education count. This presentation discusses several roadblocks to learning SAS® syntax or point-and-click from the student perspective and several solutions developed jointly by students and educators in one graduate educational program.
Read the paper (PDF)
Charlotte Baker, Florida A&M University
Matthew Dutton, Florida A&M University
P
Session 1116-2017:
Protecting the Innocent (and Your Data)
A recurring problem with large research databases containing sensitive health, financial, and personal information about individuals is how to make meaningful extracts available to qualified researchers without compromising the privacy of the individuals whose data is in the database. This problem is exacerbated when a large number of extracts need to be made from the database. In addition to using statistical disclosure control methods, this paper recommends limiting the variables included in each extract to the minimum needed and implementing a method of assigning request-specific randomized IDs to each extract that is both secure and self-documenting.
Read the paper (PDF)
Stanley Legum, Westat
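The abstract does not spell out the paper's ID-assignment method, so the following Python sketch is only one plausible reading of "request-specific randomized IDs" combined with minimal variable selection. Everything in it is an assumption made for illustration: the function names, the `R<request>-<random>` ID format, and the sample records. Embedding the request number in the ID is one way an extract could document itself, and drawing the suffix from a cryptographic random source means the new ID carries no information about the source ID.

```python
import secrets


def assign_request_ids(source_ids, request_number):
    """Build a crosswalk from source IDs to random, request-specific IDs.

    Hypothetical format: "R" + zero-padded request number + random hex.
    The crosswalk must be stored securely and never shipped with the extract.
    """
    crosswalk = {}
    used = set()
    for source_id in source_ids:
        while True:
            candidate = f"R{request_number:04d}-{secrets.token_hex(4)}"
            if candidate not in used:   # retry on the rare collision
                break
        used.add(candidate)
        crosswalk[source_id] = candidate
    return crosswalk


def build_extract(records, crosswalk, needed_variables, id_field="person_id"):
    """Keep only the requested variables and swap in the request-specific ID."""
    return [
        {"request_id": crosswalk[rec[id_field]],
         **{name: rec[name] for name in needed_variables}}
        for rec in records
    ]


# Hypothetical source records.
records = [
    {"person_id": 101, "age": 34, "income": 52000, "diagnosis": "E11"},
    {"person_id": 102, "age": 58, "income": 71000, "diagnosis": "I10"},
    {"person_id": 103, "age": 45, "income": 39000, "diagnosis": "E11"},
]

crosswalk = assign_request_ids([r["person_id"] for r in records], request_number=42)
extract = build_extract(records, crosswalk, needed_variables=["age", "diagnosis"])
```

Because a fresh crosswalk is generated for every request, the same person receives unrelated IDs in different extracts, which makes it harder to link extracts to each other.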
Q
Session 0928-2017:
Quick Results with PROC SQL
SQL is a universal language that allows you to access data stored in relational databases or tables. This hands-on workshop presents core concepts and features of using PROC SQL to access data stored in relational database tables. Attendees learn how to define, access, and manipulate data from one or more tables using PROC SQL quickly and easily. Numerous code examples are presented on how to construct simple queries, subset data, produce simple and effective output, join two tables, summarize data with summary functions, construct BY-groups, identify FIRST. and LAST. rows, and create and use virtual tables.
Read the paper (PDF) | Download the data file (ZIP)
Kirk Paul Lafler, Software Intelligence Corporation
Session 0173-2017:
Quick Results with SAS® Enterprise Guide®
SAS® Enterprise Guide® empowers organizations, programmers, business analysts, statisticians, and end users with all the capabilities that SAS has to offer. This hands-on workshop presents the SAS Enterprise Guide graphical user interface (GUI). It covers access to multi-platform enterprise data sources, various data manipulation techniques that do not require you to learn complex coding constructs, built-in wizards for performing reporting and analytical tasks, the delivery of data and results to a variety of mediums and outlets, and support for data management and documentation requirements. Attendees learn how to use the graphical user interface to access SAS® data sets and tab-delimited and Microsoft Excel input files; to subset and summarize data; to join (or merge) two tables together; to flexibly export results to HTML, PDF, and Excel; and to visually manage projects using flow diagrams.
Read the paper (PDF)
Kirk Paul Lafler, Software Intelligence Corporation
Ryan Lafler
Session 0998-2017:
Quick Results with SAS® University Edition
The announcement of SAS Institute's free SAS® University Edition is an exciting development for SAS users and learners around the world! The software bundle includes Base SAS®, SAS/STAT® software, SAS/IML® software, SAS® Studio (user interface), and SAS/ACCESS® for Windows, with all the popular features found in the licensed SAS versions. This is an incredible opportunity for users, statisticians, data analysts, scientists, programmers, students, and academics everywhere to use (and learn) SAS for career opportunities and advancement. Capabilities include data manipulation, data management, comprehensive programming language, powerful analytics, high-quality graphics, world-renowned statistical analysis capabilities, and many other exciting features. This paper illustrates a variety of powerful features found in the SAS University Edition. Attendees will be shown a number of tips and techniques on how to use the SAS® Studio user interface, and they will see demonstrations of powerful data management and programming features found in this exciting software bundle.
Read the paper (PDF)
Ryan Lafler
R
Session 1307-2017:
Red Rover, Red Rover, Send Data Right Over: Exploring External Geographic Data Sources with SAS®
The intrepid Mars Rovers have inspired awe and curiosity and dreams of mapping Mars using SAS/GRAPH® software. This presentation demonstrates how to import Esri shapefile (SHP) data (using the MAPIMPORT procedure) from sources other than SAS® and GfK GeoMarketing map data to produce useful (and sometimes creative) maps. Examples include mapping neighborhoods, ZCTA5 areas, postal codes, and of course, Mars. Products used are Base SAS® and SAS/GRAPH®. SAS programmers of any skill level will benefit from this presentation.
Read the paper (PDF)
Louise Hadden, Abt Associates
Session 1118-2017:
Removing Personally Identifiable Information
At the end of a project, many institutional review boards (IRBs) require project directors to certify that no personally identifiable information (PII) is retained by a project. This paper briefly reviews what information is considered PII and explores how to identify variables containing PII in a given project. It then shows a comprehensive way to ensure that all SAS® variables containing PII have their values set to NULL and how to use SAS to document that this has been done.
Read the paper (PDF)
Stanley Legum, Westat
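The paper works with SAS data sets. As a rough, language-neutral illustration of the two steps the abstract names (setting PII values to null and documenting that it was done), here is a small Python sketch. The function name, the variable names, and the report layout are hypothetical and are not taken from the paper.

```python
def scrub_pii(records, pii_variables):
    """Return copies of the records with PII values nulled, plus an audit report.

    The report counts, per PII variable, how many non-null values were
    cleared, which gives a simple record that the scrub was performed.
    """
    pii = set(pii_variables)
    scrubbed = [
        {name: (None if name in pii else value) for name, value in rec.items()}
        for rec in records
    ]
    report = {
        name: sum(1 for rec in records if rec.get(name) is not None)
        for name in sorted(pii)
    }
    return scrubbed, report


# Hypothetical project data.
records = [
    {"id": 1, "name": "Pat Doe", "ssn": "000-00-0000", "score": 88},
    {"id": 2, "name": "Sam Roe", "ssn": None, "score": 92},
]

scrubbed, report = scrub_pii(records, pii_variables=["name", "ssn"])
print(report)  # number of values cleared per PII variable
```

Returning new records rather than editing in place keeps the original data available until the audit report has been reviewed; a real workflow would then delete or overwrite the originals.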
Session 0838-2017:
Revolutionizing Statistical Computing in SAS® with the Jupyter Notebook
From state-of-the-art research to routine analytics, the Jupyter Notebook offers an unprecedented reporting medium. Historically, tables, graphics, and other types of output had to be created separately, and then integrated into a report piece by piece, amidst the drafting of text. The Jupyter Notebook interface enables you to create code cells and markdown cells in any arrangement. Markdown cells allow all typical formatting. Code cells can run code in the document. As a result, report creation happens naturally and in a completely reproducible way. Handing a colleague a Jupyter Notebook file to be re-run or revised is much easier and simpler for them than passing along, at a minimum, two files: one for the code and one for the text. Traditional reports become dynamic documents that include both text and living SAS® code that is run during document creation. With the new SAS kernel for Jupyter, all of this is possible and more!
Read the paper (PDF)
Hunter Glanz
S
Session 1311-2017:
SAS/GRAPH® and GfK GeoMarketing Maps: a Subject Matter Expert Winning Combination
SAS® has an amazing arsenal of tools for using and displaying geographic information that are relatively unknown and underused. High-quality GfK GeoMarketing maps have been provided by SAS since the second maintenance release for SAS® 9.3, as sources for inexpensive map data dried up. SAS has been including both GfK and traditional SAS map data sets with licenses for SAS/GRAPH® software for some time, recognizing there will need to be an extended transitional period. However, for those of us who have been putting off converting our SAS/GRAPH mapping programs to use the new GfK maps, the time has come, as the traditional SAS map data sets are no longer being updated. If you visit SAS® Maps Online, you can find only GfK maps in current maps. The GfK maps are updated once a year. This presentation walks through the conversion of a long-standing SAS program to produce multiple US maps for a data compendium to take advantage of GfK maps. Products used are Base SAS® and SAS/GRAPH®. SAS programmers of any skill level will benefit from this presentation.
Read the paper (PDF)
Louise Hadden, Abt Associates
Session 1157-2017:
Statistical Volunteering with SAS: Experiences and Opportunities
This presentation brings together experiences from SAS® professionals working as volunteers for organizations, charities, and in academic research. Pro bono work, much like that done by physicians, attorneys, and professionals in other areas, is rapidly growing in statistical practice as an important part of a statistical career, offering the opportunity to use your skills in places where they are much needed but cannot be supported in a for-pay position. Statistical volunteers also gain important learning experiences, mentoring, networking, and other opportunities for professional development. The presenter shares experiences from volunteering for local charities, non-governmental organizations (NGOs), and other organizations and causes, both in the US and around the world. The mission, methods, and focus of some organizations are presented, including DataKind, Statistics Without Borders, Peacework, and others.
Read the paper (PDF)
David Corliss, Peace-Work
T
Session 1450-2017:
The Effects of Socioeconomic, Demographic Variables on US Mortality Using SAS® Visual Analytics
Every visualization tells a story. The effectiveness of showing data through visualization becomes clear as these visualizations tell stories about differences in US mortality using the National Longitudinal Mortality Study (NLMS) data, using the Public-Use Microdata Samples (PUMS) of 1.2 million cases and 122 thousand records of mortality. SAS® Visual Analytics is a versatile and flexible tool that easily displays the simple effects of differences in mortality rates between age groups, genders, races, places of birth (native or foreign), education and income levels, and so on. Sophisticated analyses including logistic regression (with interactions), decision trees, and neural networks that are displayed in a clear, concise manner help describe more interesting relationships among variables that influence mortality. Some of the most compelling examples are: Males who live alone have a higher mortality rate than females. White men have higher rates of suicide than black men.
Read the paper (PDF) | View the e-poster or slides (PDF)
Catherine Loveless-Schmitt, U.S. Census Bureau
Session 0832-2017:
The Elusive Data Scientist: Real-World Analytic Competencies
You've all seen the job posting that looks more like an advertisement for the ever-elusive unicorn. It begins by outlining the required skills that include a mixture of tools, technologies, and masterful things that you should be able to do. Unfortunately, many such postings begin with restrictions to those with advanced degrees in math, science, statistics, or computer science and experience in your specific industry. Candidates must be able to perform predictive modeling and natural language processing, and, for good measure, they should apply only if they know artificial intelligence, cognitive computing, and machine learning. The candidate should be proficient in SAS®, R, Python, Hadoop, ETL, real-time, in-cloud, in-memory, in-database and must be a master storyteller. I know of no one who would be able to fit that description and still be able to hold a normal conversation with another human. In our work, we have developed a competency model for analytics, which describes nine performance domains that encompass the knowledge, skills, behaviors, and dispositions that today's analytic professional should possess in support of a learning, analytically driven organization. In this paper, we describe the model and provide specific examples of job families and career paths that can be followed based on the domains that best fit your skills and interests. We also share with participants a self-assessment tool so that they can see where they stack up!
Read the paper (PDF)
Greg Nelson, Thotwave Technologies, LLC.
Session SAS0289-2017:
The Well-Equipped Student: Using SAS® University Edition and E-Learning to Gain SAS® Skills
SAS® programming skills are much in demand, and numerous free tools are available for students who want to develop those skills. This paper introduces students to SAS® Studio and the Jupyter Notebook interface within SAS® University Edition. To make this introduction more tangible, the paper uses a large data set of baseball statistics as an example. In particular, statistical analysis using SAS® Studio examines the relationship between salary and performance for major leaguers. From importing text files to creating basic statistics to doing a more advanced analysis, this paper shows multiple ways to carry out tasks so that you can choose whichever method works best for you. Additional statistics that use t tests and linear regression are simple with SAS University Edition. For completeness, the paper shows the same code that is used in SAS Studio examples in the context of Jupyter Notebook in SAS University Edition. The paper also provides additional information about SAS e-learning and SAS Certification to show students how to be fully equipped in order to apply themselves to analytics and data exploration.
Read the paper (PDF)
Randy Mullis, SAS
Allison Mahaffey, SAS
Session 1270-2017:
Time Series Analysis and Forecasting in SAS® University Edition
Time series analysis and forecasting have always been popular as businesses realize the power and impact they can have. Getting students to learn effective and correct ways to build their models is key to having successful analyses as more graduates move into the business world. Using SAS® University Edition is a great way for students to learn analysis, and this talk focuses on the time series tasks. A brief introduction to time series is provided, as well as other important topics that are key to building strong models.
Read the paper (PDF)
Chris Battiston
U
Session 0854-2017:
Using the LOGISTIC or SURVEYLOGISTIC Procedure and Weighting of Public-Use Data in the Classroom
The rapidly evolving informatics capabilities of the past two decades have resulted in amazing new data-based opportunities. Large public-use data sets are now available for easy download and utilization in the classroom. Days of classroom exercises based on static, clean, easily maneuverable samples of 100 or fewer are over. Instead, we have large and messy real-world data at our fingertips, allowing for educational opportunities not available in years past. There are now hundreds of public-use data sets available for download and analysis in the classroom. Many of these sources are survey-based and require an understanding of weighting techniques. These techniques are necessary for proper variance estimation, allowing for sound inferences through statistical analysis. This example uses the California Health Interview Survey to present and compare weighted and non-weighted results using the SURVEYLOGISTIC procedure.
Read the paper (PDF)
Tyler Smith, National University
Besa Smith, Analydata
V
Session 1374-2017:
Visualizing the Demographics of a Large Healthcare Provider's Membership using SAS®
Visualization of complex data can be a valuable tool for researchers and policy makers, and Base SAS® has powerful tools for such data exploration. In particular, SAS/GRAPH® software is a flexible tool that enables the analyst to create a wide variety of data visualizations. This paper uses SAS® to visualize complex demographic data related to the membership of a large American healthcare provider. Kaiser Permanente (KP) has demographic data about 4 million active members in Southern California. We use SAS to create a number of geographic visualizations of KP demographic data related to membership at the census-block level of detail and higher. Demographic data available from the US Census' American Community Survey (ACS) at the same level of geographic organization are also used as comparators to show how the KP membership differs from the demographics of the geographies from which it draws. In addition, we use SAS to create a number of visualizations of KP demographic data related to utilizations (inpatient and outpatient) at the medical center area level through time. As with the membership data, data available from the ACS are used as a comparator to show how patterns of KP utilizations at various medical centers compare to the demographics of the populations that these medical centers serve. The paper will be of interest to programmers learning how to use SAS to visualize data and to researchers interested in the demographics of one of the largest healthcare providers in the US.
Read the paper (PDF)
Don McCarthy, Kaiser Permanente
Michael Santema, Kaiser Permanente