Exploration of clustering techniques over geographical and other dimensions
This post explains attempts to define more homogeneous neighborhoods for a project about predicting gentrification in Philadelphia. For more context on the project as a whole, see this post.
Neighborhoods are an important characteristic of cities, many with evolving subcultures and lifestyles within them. Computationally, defining the boundaries of a neighborhood can be very difficult. Measures over predefined spatial boundaries can lead to a misrepresentation of the data, known as the Modifiable Areal Unit Problem (MAUP). Instead, clustering techniques can be used to delineate more internally homogenous regions for analysis. …
Leveraging spatial indices for geospatial feature engineering
This post explains k-nearest neighbors as a feature engineering technique in geospatial machine learning for a project about predicting gentrification in Philadelphia. For more context on the project as a whole, see this post.
Many geospatial datasets include data detailing locations of specific events, such as incidents of crime. In order to use this crucial data as features for house price prediction, this city-wide data had to be converted into a consistent, per-parcel feature. One technique I used was identifying the average distance to the k nearest neighbors of each event. For example, one feature could be the average distance to the 10 closest recorded crimes over each year. …
Over the last eight years, the Philadelphia housing market has turned around from recession and is primed to accelerate. At the same time, thousands of impoverished tenants struggle to find and maintain reasonably priced housing. Affordable housing initiatives have not come without criticisms regarding the placement of new housing projects, particularly with the concentration of new developments in already low-income areas. While one can argue that locating affordable housing projects in these areas keeps tenants close to their existing communities, it also concentrates them away from possible economic growth and social mobility.
Gentrification is a major source of neighborhood change and is key to understanding the shift in housing price over time. In the interest of drawing in higher income residents, the public sector works to provide amenities and leisure at higher standards. Focus on gentrifying areas pulls funds away from other generally lower income areas. These neighborhoods then go through periods of disinvestment, or neighborhood decline. As the urban center expands and surrounding land value increases, disinvested regions become primed for future cycles of gentrification. This process is evident in Philadelphia, where 13,000 lower-cost units were lost while 6,000 high end units were added from 2008 to 2016 alone. …
A journey through the SoundCloud network
Though SoundCloud’s journey has been anything but stable, one key advantage they have over other music streaming services is a prime combination of content distribution and social networks. As a novice producer starting out myself, I asked the question on everyone’s mind: How do songs go viral? Using networks, I tried to figure it out.
My original hypothesis was based on the idea of “Mavens”, coined in Malcolm Gladwell’s Tipping Point.
“Mavens are […] information specialists who we rely on to connect us to new information.”
In this case, a Maven would be the one friend who always shares the music they’ve listened to recently, or the one whose playlists you have on repeat. These are people who seek out new music and, importantly, actively disseminate it to their immediate network. …