Open-Sourcing Thumbtack’s Economic Sentiment Data

Today, we’re very happy to announce that we’re open sourcing the results from Thumbtack’s monthly Economic Sentiment Survey (ESS) series. The ESS captures the attitudes and perspectives of thousands of business owners across the country to gauge how they feel about the economy and their businesses. Now in its fifth year, this survey provides a unique vantage point on the economy, as respondents are largely mobile service professionals with five or fewer employees who operate in households across the United States. Because they are hard to reach, these professionals are frequently overlooked in other surveys of small businesses.

With Thumbtack’s Economic Sentiment Survey data, we can seek to understand relationships between small business sentiments and various economic indicators including unemployment rates and inflation. In addition, our data can provide color to federal and local policy discussions on topics that affect small business owners such as healthcare and local industry regulations.

In this post, we’ll share how you can access the data via our API and start analyzing it in R. If you’re a Python user, don’t worry: we’ve posted a separate, Python-specific tutorial on our GitHub repo. And, of course, if you do use our data, we ask that you cite us properly.

Before we start, it’s important to point out that the data can be cut in a variety of ways: by state, month, industry, or demographic group. To get a comprehensive overview of the scope of our data and find out more about our survey methodology, we recommend reviewing our full documentation on GitHub. For now, we’ll focus on how to quickly get up and running in accessing the various data we’re now publishing.

How to access ESS data via R

Step 1. Request the data from the ESS API

The first thing to note is that our API stores the data in JSON format, so to get it into R, you’ll need to use the httr and jsonlite packages, which provide us with efficient, generalizable functions to pull data from web APIs and get them into the right format for analysis in R.* The httr package allows you to send HTTP requests and receive HTTP responses directly from R, while the jsonlite package allows you to convert JSON objects to R objects.**

So, as a starting point, load both libraries: httr and jsonlite. Then, use the GET() function from httr to make a request to our API.

library(httr)     # functions to send requests to the ESS API
library(jsonlite) # functions to convert JSON data to R objects

# Endpoint for sentiment scores by state
ss_url <- modify_url("https://data.thumbtack.com/v1/sentiments/states?")

# Send the GET request and keep the response for the next steps
ss_response <- GET(ss_url)

Step 2. Check status of the request

To check the status code, print the saved response variable, or apply the status_code() function to it.

status_code(ss_response)

If you’re pulling data over multiple iterations, such as sentiment scores by Age, then calling either warn_for_status(x) or stop_for_status(x) will display a warning or break the loop, respectively, if there is an issue with the status code. Including one of them helps catch errors as soon as they occur.
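As a minimal sketch of that pattern, the helper below wraps GET() and stop_for_status() so that a loop over several requests halts at the first failed one. The helper name fetch_checked and the urls vector are hypothetical; only the states endpoint appears in this post, so the endpoints for other cuts would need to come from the documentation.

```r
library(httr)

# Hypothetical helper: send one request and fail fast on a bad status.
# `task` is included in the error message, which makes it easier to see
# which iteration failed.
fetch_checked <- function(url, task = NULL) {
  response <- GET(url)
  # Raises an R error when the status code is 300 or above;
  # swap in warn_for_status() to emit a warning and keep going instead.
  stop_for_status(response, task = task)
  response
}

# Assumed set of request URLs, one per iteration. Only the first one is
# taken from this post; add others from the ESS documentation.
urls <- c(states = "https://data.thumbtack.com/v1/sentiments/states")

# Not run here because it needs network access:
# responses <- lapply(names(urls), function(nm) {
#   fetch_checked(urls[[nm]], task = paste("fetch", nm))
# })
```

With stop_for_status() the loop ends at the first failure, so no later step runs on an incomplete set of responses.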

Step 3. Retrieve content of the request

If the status is OK (200),*** proceed to retrieve the contents of the request as a JSON string using content() with type “text”. The available formats are “raw”, “text”, and “parsed”. While you can use “parsed” to retrieve an automatically parsed R object, in this example, we use “text” to retrieve the content as a character vector.

ss_text <- content(ss_response, "text")

Step 4. Convert content of JSON string to an R object

Then convert the JSON string you’re working with to an R object using the fromJSON() function. This function takes a JSON string, URL, or file.

ss_data <- fromJSON(ss_text, flatten = TRUE)$data
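To see what flatten = TRUE changes, here is a small offline sketch that parses a made-up JSON string. The field names (state, score, value, n) are assumptions for illustration only and are not the actual shape of the ESS response; only the top-level data element comes from the call above.

```r
library(jsonlite)

# Hypothetical payload with a nested "score" object per record.
sample_text <- '{"data": [
  {"state": "CA", "score": {"value": 61.5, "n": 120}},
  {"state": "TX", "score": {"value": 58.2, "n": 95}}
]}'

# Without flatten, the nested object becomes a data frame stored
# inside a single column of the outer data frame.
nested <- fromJSON(sample_text)$data

# With flatten = TRUE, nested fields become ordinary columns whose
# names are joined with a dot.
flat <- fromJSON(sample_text, flatten = TRUE)$data

names(nested)  # "state" "score"
names(flat)    # "state" "score.value" "score.n"
```

The flattened form is usually easier to filter, join, and plot because every column is a plain vector.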

Step 5. Assign index labels to each data pull (optional)

If you are pulling data with a certain demographic cut, make sure to index each data pull to keep track of each iteration. For example, if you pull state sentiment scores by Gender, assign a new column such as ‘index’ to that pull and specify ‘Male’ or ‘Female’ (refer to the data dictionary for correct index assignment).
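One way to do this labeling is sketched below with base R. The two data frames are hypothetical stand-ins for pulls that would normally come from fromJSON() in Step 4, and their column names are illustrative rather than taken from the API.

```r
# Hypothetical stand-ins for two pulls, keyed by the label of each cut.
pulls <- list(
  Male   = data.frame(state = c("CA", "TX"), score = c(60.1, 57.3)),
  Female = data.frame(state = c("CA", "TX"), score = c(62.4, 59.0))
)

# Tag every row of each pull with the label of the cut it came from.
labelled <- Map(function(df, label) {
  df$index <- label
  df
}, pulls, names(pulls))

# Stack the labelled pulls into a single data frame.
ss_by_gender <- do.call(rbind, unname(labelled))
```

Keeping the label in a column rather than in separate variables means the combined data frame can be grouped or filtered by index later.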

Curious to Learn More About The Data?

Check out the ESS survey website for an interactive view of this data and monthly summaries on economic sentiments of Thumbtack Pros. For more examples on how to access the ESS data via R or Python, check out our tutorials for both on GitHub.

Notes

* For a more comprehensive guide to accessing web data in R, “A quickstart guide to httr”, written by Hadley Wickham, is a great resource to check out. It provides an overview of the various functionalities of httr beyond the ones discussed in this post.

** Another way we use the httr package at Thumbtack is to implement Google authentication via OAuth tokens for users accessing Shiny dashboards that are only for internal audiences.

*** A complete guide to HTTP status codes is available at http://www.restapitutorial.com/httpstatuscodes.html.

Source: Thumbtack Engineering
