Practical SQL: A Beginner’s Guide to Storytelling with Data






APPENDIX
ADDITIONAL POSTGRESQL RESOURCES
PostgreSQL Development Environments
PostgreSQL Utilities, Tools, and Extensions
PostgreSQL News
Documentation
INDEX


FOREWORD
When people ask which programming language I learned first, I often
absent-mindedly reply, “Python,” forgetting that it was actually with SQL
that I first learned to write code. This is probably because learning SQL felt
so intuitive after spending years running formulas in Excel spreadsheets. I
didn’t have a technical background, but I found SQL’s syntax, unlike that of
many other programming languages, straightforward and easy to implement.
For example, you run SELECT * on a SQL table to make every row and column
appear. You simply use the JOIN keyword to return rows of data from
different related tables, which you can then further group, sort, and analyze.
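As a minimal sketch of those two ideas, the Python snippet below uses the standard-library sqlite3 module with an in-memory database rather than PostgreSQL. The table names (colleges, salaries), the columns, and the figures are hypothetical, made up only for illustration.

```python
import sqlite3

# In-memory SQLite database; tables and values are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE colleges (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE salaries (college_id INTEGER, avg_salary INTEGER);
    INSERT INTO colleges VALUES (1, 'College A'), (2, 'College B');
    INSERT INTO salaries VALUES (1, 70000), (2, 85000);
""")

# SELECT * returns every row and every column of a table.
all_colleges = conn.execute("SELECT * FROM colleges ORDER BY id").fetchall()

# JOIN returns rows from related tables, matched on a shared key;
# the combined rows can then be sorted (or grouped) further.
joined = conn.execute("""
    SELECT c.name, s.avg_salary
    FROM colleges AS c
    JOIN salaries AS s ON s.college_id = c.id
    ORDER BY s.avg_salary DESC
""").fetchall()

print(all_colleges)
print(joined)
conn.close()
```

The same SELECT and JOIN statements should carry over to PostgreSQL largely unchanged. SQLite is used here only so the sketch runs without a database server.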
I’m a graphics editor, and I’ve worked as a developer and journalist at a
number of publications, including POLITICO, Vox, and USA TODAY. My
daily responsibilities involve analyzing data and creating visualizations from
what I find. I first used SQL when I worked at The Chronicle of Higher
Education and its sister publication, The Chronicle of Philanthropy. Our team
analyzed data ranging from nonprofit financials to faculty salaries at colleges
and universities. Many of our projects included as much as 20 years’ worth of
data, and one of my main tasks was to import all that data into a SQL
database and analyze it. I had to calculate the percent change in fundraising
dollars at a nonprofit or find the median endowment size at a university to
measure an institution’s performance.
I discovered SQL to be a powerful language, one that fundamentally
shaped my understanding of what you can—and can’t—do with data. SQL
excels at bringing order to messy, large data sets and helps you discover how
different data sets are related. Plus, its queries and functions are easy to reuse
within the same project or even in a different database.
This leads me to Practical SQL. Looking back, I wish I’d read Chapter 4
on “Importing and Exporting Data” so I could have understood the power of
bulk imports instead of writing long, cumbersome INSERT statements when
filling a table. The statistical capabilities of PostgreSQL, covered in
Chapters 5 and 10 in this book, are also something I wish I had grasped
earlier, as my data analysis often involves calculating the percent change or
finding the average or median values. I’m embarrassed to say that I didn’t
know how percentile_cont(), covered in Chapter 5, could be used to easily
calculate a median in PostgreSQL, with the added bonus that it also finds
your data’s natural breaks or quantiles.
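To make those two calculations concrete, here is a small Python sketch. It is not PostgreSQL's implementation: percent_change() is a hypothetical helper, and the percentile_cont() function below only mimics the linear interpolation that PostgreSQL's percentile_cont() is documented to perform. The sample values are invented, and the input list is assumed to be non-empty.

```python
import math

def percent_change(old, new):
    """Percent change from an old value to a new one."""
    return (new - old) / old * 100

def percentile_cont(values, fraction):
    """Continuous percentile: interpolate linearly between neighbors.

    Mirrors the idea behind PostgreSQL's
    percentile_cont(fraction) WITHIN GROUP (ORDER BY column);
    assumes a non-empty list and 0 <= fraction <= 1.
    """
    ordered = sorted(values)
    position = fraction * (len(ordered) - 1)  # zero-based fractional row
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight

endowments = [40, 10, 30, 20]  # hypothetical values, any order
median = percentile_cont(endowments, 0.5)
quartiles = [percentile_cont(endowments, f) for f in (0.25, 0.5, 0.75)]
change = percent_change(200, 250)

print(median, quartiles, change)
```

A fraction of 0.5 gives the median. Passing several fractions, as in the quartiles line, gives the cut points that divide the data into quantiles.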
But at that stage in my career, I was only scratching the surface of SQL’s
capabilities. It wasn’t until 2014, when I became a data developer at Gannett
Digital on a team led by Anthony DeBarros, that I learned to use
PostgreSQL. I began to understand just how enormously powerful SQL was
for creating a reproducible and sustainable workflow.
When I met Anthony, he had been working at USA TODAY and other
Gannett properties for more than 20 years, where he had led teams that built
databases and published award-winning investigations. Anthony was able to
show me the ins and outs of our team’s databases in addition to teaching me
how to properly build and maintain my own. It was through working with
Anthony that I truly learned how to code.
One of the first projects Anthony and I collaborated on was the 2014 U.S.
midterm elections. We helped build an election forecast data visualization to
show USA TODAY readers the latest polling averages, campaign finance
data, and biographical information for more than 1,300 candidates in more
than 500 congressional and gubernatorial races. Building our data
infrastructure was a complex, multistep process powered by a PostgreSQL
database at its heart.
Anthony taught me how to write code that funneled all the data from our
sources into a half-dozen tables in PostgreSQL. From there, we could query
the data into a format that would power the maps, charts, and front-end
presentation of our election forecast.
Around this time, I also learned one of my favorite things about
PostgreSQL—its powerful suite of geographic functions (Chapter 14 in this
book). By adding the PostGIS extension to the database, you can create
spatial data that you can then export as GeoJSON or as a shapefile, a format
that is easy to map. You can also perform complex spatial analysis, like
calculating the distance between two points or finding the density of schools
or, as Anthony shows in the chapter, all the farmers’ markets in a given
radius.
It’s a skill I’ve used repeatedly in my career. For example, I used it to
build a data set of lead exposure risk at the census-tract level while at Vox,
which I consider one of my crowning PostGIS achievements. Using this
database, I was able to create a data set of every U.S. Census tract and its
corresponding lead exposure risk in a spatial format that could be easily
mapped at the national level.
With so many different programming languages available—more than
200, if you can believe it—it’s truly overwhelming to know where to begin.
One of the best pieces of advice I received when first starting to code was to
find an inefficiency in my workflow that could be improved by coding. In my
case, it was building a database to easily query a project’s data. Maybe you’re
in a similar boat or maybe you just want to know how to analyze large data
sets.
Regardless, you’re probably looking for a no-nonsense guide that skips
the programming jargon and delves into SQL in an easy-to-understand
manner that is both practical and, more importantly, applicable. And that’s
exactly what Practical SQL does. It gets away from programming theory and
focuses on teaching SQL by example, using real data sets you’ll likely
encounter. It also doesn’t shy away from showing you how to deal with
annoying messy data pitfalls: misspelled names, missing values, and columns
with unsuitable data types. This is important because, as you’ll quickly learn,
there’s no such thing as clean data.
Over the years, my role as a data journalist has evolved. I build fewer
databases now and build more maps. I also report more. But the core
requirement of my job, and what I learned when first learning SQL, remains
the same: know thy data and to thine own data be true. In other words, the
most important aspect of working with data is being able to understand
what’s in it.
You can’t expect to ask the right questions of your data or tell a
compelling story if you don’t understand how to best analyze it. Fortunately,
that’s where Practical SQL comes in. It’ll teach you the fundamentals of
working with data so that you can discover your own stories and insights.
Sarah Frostenson
Graphics Editor at POLITICO


