Accessibility has become a hot topic on campus due to a flurry of recent investigations of discrimination against students with disabilities by the U.S. Department of Justice and the U.S. Department of Education. This paper provides an update on the latest improvements in SAS® University Edition that are specifically targeted to enable students with disabilities to excel in the classroom and beyond. This paper covers the entire SAS University Edition user experience including installation, documentation, training, support, using SAS® Studio, and the new accessibility features in the fourth maintenance release of SAS® 9.4.
Ed Summers, SAS
Amy Peters, SAS
Uber has changed the face of taxi ridership, making it more convenient and comfortable for riders. But there are times when customers are dissatisfied because of a shortage of Uber vehicles, which ultimately leads to Uber surge pricing. It's a very difficult task to forecast the number of riders at different locations in a city at different points in time. This gets more complicated with changes in weather. In this paper, we attempt to estimate the number of trips per borough on a daily basis in New York City. We add an exogenous factor, weather, to this analysis to see how it impacts the changes in the number of trips. We fetched six months' worth of data (approximately 9.7 million records) of Uber rides in New York City ranging from January 2015 to June 2015 from GitHub. We gathered weather data (about 3.5 million records) for New York City for the same period from the National Climatic Data Center. We analyzed Uber data and weather data together to estimate the change in the number of trips per borough due to changing weather conditions. We built a model to predict the number of trips per day for a one-week-ahead forecast for each borough of New York City. As part of a further analysis, we got the number of trips on a particular day for each borough. Using time series analysis, we forecast the number of trips that might be required in the near future (probably one week).
Anusha Mamillapalli, Oklahoma State University
Singdha Gutha, Oklahoma State University
Research frequently shows that exposure to sunlight contributes to non-melanoma skin cancer. But it also shows that sunlight might protect you against multiple sclerosis and breast, ovarian, prostate, and colon cancer. In my study, I explored whether mortality from skin cancer, myocardial infarction, atrial fibrillation, and stroke is associated with exposure to sunlight. I used SAS® 9.4 and RStudio to conduct the entire study. I collected mortality data including cause of death in Los Angeles from 2000 to 2003. In addition, I collected sunlight data for Los Angeles for the same period. There are three types of sunlight in my data: global sunlight, diffuse sunlight, and direct sunlight. Data was collected at three different times: morning, middle of day, and afternoon. I used two models, the Poisson time series regression model and a logistic regression model, to investigate the association. I considered a one-year and two-year lag of sunlight association with the types of diseases. I adjusted for age, sex, race, education, temperature, and day of week. Results show that stroke is statistically significantly associated with a one-year lag of sunlight (p<0.001). Previous epidemiological studies have found that sunlight exposure can ameliorate osteoporosis in stroke patients, and my study provides evidence of the protective effects of sunlight on stroke patients.
Wei Xiong, University of Southern California
SAS® Output Delivery System (ODS) Graphics started appearing in SAS® 9.2. Collectively these new tools were referred to as 'ODS Graphics,' 'SG Graphics' and 'Statistical Graphics'. When first starting to use these tools, the traditional SAS/GRAPH® software user might come upon some very significant challenges in learning the new way to do things. This is further complicated by the lack of simple demonstrations of capabilities. Most graphs in training materials and publications are rather complicated graphs that, while useful, are not good teaching examples for beginners. This paper contains many examples of very simple ways to get very simple things accomplished. Many different graphs are developed using only a few lines of code each, using data from the SASHELP data sets. The use of the SGPLOT, SGPANEL, and SGSCATTER procedures is shown. In addition, the paper addresses those situations in which the user must alternatively use a combination of the TEMPLATE and SGRENDER procedures to accomplish the task at hand. Most importantly, the use of the 'ODS Graphics Designer' as a teaching tool and a generator of sample graphs and code is covered. This tool makes use of the TEMPLATE and SGRENDER procedures, generating Graphics Template Language (GTL) code. Users become productive very quickly. The emphasis in this paper is the simplicity of the learning process. Users will be able to take the generated code and run it immediately on their personal machines.
Roger Muller, Data-to-Events
The Institute for Advanced Analytics struggled to provide student computing environments capable of analyzing increasingly large data sets for its Master of Science in Analytics program. For the fast-paced practicum, the centerpiece of the curriculum, waiting 24 hours for a FREQ procedure to complete was unacceptable. Practicum proposals from industry were pared down (or turned down) because the data sets were too large, depriving students of exciting and relevant learning experiences. By augmenting the practicum architecture with an 18-node computing cluster running SAS® Grid Manager, SAS® Visual Analytics, and the latest high-performance SAS® procedures, we were able to dramatically increase performance and begin accepting terabyte-scale practicum proposals from industry. In this paper, we discuss the benefits and lessons learned through adding these SAS products to our analytics degree program, including capability versus complexity tradeoffs, and the state of our current capabilities and limitations with this architecture.
John Jernigan, Institute for Advanced Analytics at NC State University
Ken Gahagan, SAS
Cheryl Doninger, SAS
This presentation describes an ongoing effort to standardize and simplify SAS® coding across a rapidly growing analytics team in the health care industry. The number of SAS analysts in Kaiser Permanente's Data and Information Management Enhancement (DIME) department has nearly doubled in the past two years, going from approximately 20 to 40 analysts. The level of experience and technical skill varies greatly within the department. Analysts are required to provide quick turn-around on a large volume of analytical requests in this dynamic and high-demand environment. An effort was initiated in 2016 to create a SAS® Enterprise Guide® Template to standardize and simplify SAS coding across the department. The SAS Enterprise Guide® template is designed to be a standard project file containing predefined code shells and examples that can be used as a basis for all new SAS Enterprise Guide® projects. The primary goals of the template are to: 1) Effectively onboard new analysts to department standards; 2) Increase the efficiency of SAS development; 3) Bring consistency to how SAS is used; and 4) Simplify the transitioning of SAS jobs to the department's Production Support team. This presentation focuses on the process in which the template was initiated, drafted, and socialized across a large and diverse team of SAS analysts. It also highlights plans for ongoing maintenance of and improvements to the original template.
Amanda Pasch, Kaiser Permanente
Chris Koppenhafer, Kaiser Permanente
The discipline of data science has seen an unprecedented evolution from primordial darkness to becoming the academic equivalent of an apex predator on university campuses across the country. But survival of the discipline is not guaranteed. This session explores the genetic makeup of programs that are likely to survive, the genetic makeup of those that are likely to become extinct, and the role that the business community plays in that evolutionary process.
Jennifer Priestley, Kennesaw State University
Innovation in teaching and assessment has become critical for many reasons. This is especially true in the fields of data science and big data analytics. Reasons range from the need to significantly improve the development of soft skills (as reported in an e-skills UK and SAS® joint report from November 2014), to the rapidly changing software standards of products used by students, to the rapidly increasing range of functionality and product set, to the need to develop lifelong learning skills to learn new software and functionality. And these are just a few of the reasons. In some educational institutions, it is easy to be extremely innovative. However, in many institutions and countries, there are numerous constraints on the levels of innovation that can be implemented. This presentation captures the author's developing pedagogic practice at the University of Derby. He suggests fundamental changes to the classic approaches to teaching and assessing data science and big data analytics. These changes have resulted in significant improvements in student engagement, student achievement, and students' soft skills. Improvements are illustrated by innovations in teaching SAS to first-year students and teaching IBM Bluemix and Watson Analytics to final-year students. Students have successfully developed both technical and soft skills and experienced excellent levels of achievement.
Richard Self, University of Derby
Having crossed the spectrum from an epidemiologist and researcher (where ad hoc is a way of life and where research is the main focus) to a SAS® programmer (writing reusable code for automation and batch jobs, which require no manual interventions), I have learned a few things that I wish I had known as a researcher. These things would not only have helped me to be a better SAS programmer, but they also would have saved me time and effort as a researcher by enabling me to have well-organized, accurate code (that I didn't accidentally remove) and code that would work when I ran it again on another date. This poster presents five SAS tips that are common practice among SAS programmers. I provide researchers who use SAS with tips that are handy and useful, and I provide code (where applicable) that they can try out at home. Using the tips provided will make any SAS programmer smile when they are presented with your code (not guaranteed, but your results should not vary by using these tips).
Crystal Carel, Baylor Scott & White Health
Session 1512-2017:
Hot Topics for Analytics in Higher Education
This panel discusses a wide range of topics related to analytics in higher education. Panelists are from diverse institutions and represent academic research, information technology, and institutional research. Challenges related to data acquisition and quality, system support, and meeting customer needs are covered. Topics such as effective dashboards and reporting, big data, predictive analytics, and more are on the agenda.
Stephanie Thompson, Datamum
Glenn James, Tennessee Tech University
Robert Jackson, University of Memphis
Sean Mulvenon, University of Arkansas
Carlos Piemonti, University of Central Florida
Richard Dirmyer, Rochester Institute of Technology
Session 1432-2017:
Make a University Partnership Your Secret Weapon for Finding Data Science Talent
In this panel session, professors from three geographically diverse universities explain what makes for an effective partnership with private sector companies. Specific examples are discussed from health care, insurance, financial services, and retail. The panelists discuss what works, what doesn't, and what both parties need to be prepared to bring to the table for a long-term, mutually beneficial partnership.
Jennifer Priestley, Kennesaw State University
SAS® education is a mainstay across disciplines and educational levels in the United States. Along with other courses that are relevant to the jobs students want, independent SAS courses or SAS education integrated into additional courses can help a student be more interesting to a potential employer. The multitude of SAS offerings (SAS® University Edition, Base SAS®, SAS® Enterprise Guide®, SAS® Studio, and the SAS® OnDemand offerings) provide the tools for education, but reaching students where they are is the greatest key for making the education count. This presentation discusses several roadblocks to learning SAS® syntax or point-and-click from the student perspective and several solutions developed jointly by students and educators in one graduate educational program.
Charlotte Baker, Florida A&M University
Matthew Dutton, Florida A&M University
The announcement of SAS Institute's free SAS® University Edition is an exciting development for SAS users and learners around the world! The software bundle includes Base SAS®, SAS/STAT® software, SAS/IML® software, SAS® Studio (user interface), and SAS/ACCESS® for Windows, with all the popular features found in the licensed SAS versions. This is an incredible opportunity for users, statisticians, data analysts, scientists, programmers, students, and academics everywhere to use (and learn) for career opportunities and advancement. Capabilities include data manipulation, data management, comprehensive programming language, powerful analytics, high-quality graphics, world-renowned statistical analysis capabilities, and many other exciting features. This paper illustrates a variety of powerful features found in the SAS University Edition. Attendees will be shown a number of tips and techniques on how to use the SAS® Studio user interface, and they will see demonstrations of powerful data management and programming features found in this exciting software bundle.
Ryan Lafler
From state-of-the-art research to routine analytics, the Jupyter Notebook offers an unprecedented reporting medium. Historically, tables, graphics, and other types of output had to be created separately, and then integrated into a report piece by piece, amidst the drafting of text. The Jupyter Notebook interface enables you to create code cells and markdown cells in any arrangement. Markdown cells allow all typical formatting. Code cells can run code in the document. As a result, report creation happens naturally and in a completely reproducible way. Handing a colleague a Jupyter Notebook file to be re-run or revised is much easier and simpler for them than passing along, at a minimum, two files: one for the code and one for the text. Traditional reports become dynamic documents that include both text and living SAS® code that is run during document creation. With the new SAS kernel for Jupyter, all of this is possible and more!
Hunter Glanz
Every visualization tells a story. The effectiveness of showing data through visualization becomes clear as these visualizations tell stories about differences in US mortality using the National Longitudinal Mortality Study (NLMS) data, using the Public-Use Microdata Samples (PUMS) of 1.2 million cases and 122 thousand records of mortality. SAS® Visual Analytics is a versatile and flexible tool that easily displays the simple effects of differences in mortality rates between age groups, genders, races, places of birth (native or foreign), education and income levels, and so on. Sophisticated analyses, including logistic regression (with interactions), decision trees, and neural networks, that are displayed in a clear, concise manner help describe more interesting relationships among variables that influence mortality. Some of the most compelling examples are: Males who live alone have a higher mortality rate than females. White men have higher rates of suicide than black men.
Catherine Loveless-Schmitt, U.S. Census Bureau
SAS® programming skills are much in demand, and numerous free tools are available for students who want to develop those skills. This paper introduces students to SAS® Studio and the Jupyter Notebook interface within SAS® University Edition. To make this introduction more tangible, the paper uses a large data set of baseball statistics as an example. In particular, statistical analysis using SAS® Studio examines the relationship between salary and performance for major leaguers. From importing text files to creating basic statistics to doing a more advanced analysis, this paper shows multiple ways to carry out tasks so that you can choose whichever method works best for you. Additional statistics that use t tests and linear regression are simple with SAS University Edition. For completeness, the paper shows the same code that is used in SAS Studio examples in the context of Jupyter Notebook in SAS University Edition. The paper also provides additional information about SAS e-learning and SAS Certification to show students how to be fully equipped in order to apply themselves to analytics and data exploration.
Randy Mullis, SAS
Allison Mahaffey, SAS
Time series analysis and forecasting have always been popular as businesses realize the power and impact they can have. Getting students to learn effective and correct ways to build their models is key to having successful analyses as more graduates move into the business world. Using SAS® University Edition is a great way for students to learn analysis, and this talk focuses on the time series tasks. A brief introduction to time series is provided, as well as other important topics that are key to building strong models.
Chris Battiston
The rapidly evolving informatics capabilities of the past two decades have resulted in amazing new data-based opportunities. Large public-use data sets are now available for easy download and utilization in the classroom. Days of classroom exercises based on static, clean, easily maneuverable samples of 100 or fewer are over. Instead, we have large and messy real-world data at our fingertips, allowing for educational opportunities not available in years past. There are now hundreds of public-use data sets available for download and analysis in the classroom. Many of these sources are survey-based and require the understanding of weighting techniques. These techniques are necessary for proper variance estimation, allowing for sound inferences through statistical analysis. This example uses the California Health Interview Survey to present and compare weighted and non-weighted results using the SURVEYLOGISTIC procedure.
Tyler Smith, National University
Besa Smith, Analydata
Visualization of complex data can be a valuable tool for researchers and policy makers, and Base SAS® has powerful tools for such data exploration. In particular, SAS/GRAPH® software is a flexible tool that enables the analyst to create a wide variety of data visualizations. This paper uses SAS® to visualize complex demographic data related to the membership of a large American health care provider. Kaiser Permanente (KP) has demographic data about 4 million active members in Southern California. We use SAS to create a number of geographic visualizations of KP demographic data related to membership at the census-block level of detail and higher. Demographic data available from the US Census Bureau's American Community Survey (ACS) at the same level of geographic organization are also used as comparators to show how the KP membership differs from the demographics of the geographies from which it draws. In addition, we use SAS to create a number of visualizations of KP demographic data related to utilizations (inpatient and outpatient) at the medical center area level through time. As with the membership data, data available from the ACS are used as a comparator to show how patterns of KP utilizations at various medical centers compare to the demographics of the populations that these medical centers serve. The paper will be of interest to programmers learning how to use SAS to visualize data and to researchers interested in the demographics of one of the largest health care providers in the US.
Don McCarthy, Kaiser Permanente
Michael Santema, Kaiser Permanente