If you rely on public transportation for your daily commute, you know how frustrating it can be to miss a train. Worse still, if the trains happen to be delayed (or running express past your stop), you may end up walking 20+ minutes to a different stop/line.
This is often the case for me in Brooklyn, where the J and M lines run express past my stop for seemingly no reason. After getting fed up with walking 2 extra stops to catch an express train for the millionth time, I decided to do something about it. I could check the Google Maps transit layer on my phone every day, but I honestly never think to pull it up. Instead, I decided to
It turns out that the New York MTA hosts the service status of all train lines and buses in a publicly accessible XML text file on their website. With a little bit of Python scripting, I was able to scrape, clean, and store the subway data. I pushed this data into a Python GUI that allows the user to see the status of each line, and click on the status message to get more details. I loaded the program onto a Raspberry Pi connected to a 7" touchscreen and mounted it by my front door.
See the GUI in action here:
Full breakdown and code below the break.
Materials Used
- Raspberry Pi 2
- SDHC card with Raspbian
- Belkin Wifi Adapter (not needed if using Pi 3)
- Official Raspberry Pi 7" Touchscreen
- Python 3.5
Hardware
Raspberry Pi 2
I happened to have an old Pi 2 lying around from an XBMC media center build I stopped using a couple of years ago. I formatted the SD card with a fresh copy of Raspbian and added a Wi-Fi adapter to connect the Pi to my home WLAN.
Official Raspberry Pi Screen
My goal was to keep this build as simple as possible, so I went with the official Pi 7" touchscreen as the display for this project.
The screen itself is pretty much plug and play: the display connects to the Pi's DSI port via ribbon cable, and 2 jumper wires run from the display board's pins to the Pi's GPIO pins.
5 minutes later and voila, touchscreen pi!
At this point, the hardware is good to go. Time to write some code.
Code
Scraping the MTA Data
I decided to use Python for this project because it comes preinstalled on the Pi, has support for the Pi's GPIO pins, and has great web scraping libraries. I personally prefer and use Python 3.x over 2.x, but the same could be accomplished in 2.x with a bit of tweaking.
Below is the code for the basic scraping and parsing functionality:
from bs4 import BeautifulSoup
from urllib.request import urlopen
from collections import OrderedDict

url = 'http://web.mta.info/status/serviceStatus.txt'
xml = urlopen(url).read()
soup = BeautifulSoup(xml, "xml")
timestamp = soup.timestamp.text
subway = soup.subway
status_dict = {}
all_lines = subway.find_all('line')
for line in all_lines:
    name = line.find('name').text
    status = line.find('status').text
    text = line.find('text').text
    # Unescape the HTML entities embedded in the status text
    text = text.replace('&lt;','<').replace('&gt;','>').replace('&nbsp;',' ')
    # Lines without their own time stamp fall back to the feed's
    if line.find('Date').text == '':
        datetime = timestamp
    else:
        datetime = line.find('Date').text.strip(' ') + ' ' + line.find('Time').text.strip(' ')
    status_line = [status, datetime, text]
    status_dict[name] = status_line
sort = OrderedDict(sorted(status_dict.items()))
print(sort)
In the code above, we start by importing what we need: BeautifulSoup, urlopen, and OrderedDict.
from bs4 import BeautifulSoup
from urllib.request import urlopen
from collections import OrderedDict
Next, we set up BeautifulSoup to read in the XML file by providing the URL for the file, opening it, and parsing it with the XML parser.
url = 'http://web.mta.info/status/serviceStatus.txt'
xml = urlopen(url).read()
soup = BeautifulSoup(xml, "xml")
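Since the Pi sits by the door unattended, it may be worth guarding the download against a dropped Wi-Fi connection. Here is a minimal sketch, assuming a hypothetical helper named `fetch_status` (it is not part of the original script). Its `opener` parameter is also an assumption, added only so the function can be tried without network access:

```python
from urllib.error import URLError
from urllib.request import urlopen

URL = 'http://web.mta.info/status/serviceStatus.txt'


def fetch_status(url=URL, opener=urlopen, timeout=10):
    """Return the raw feed bytes, or None if the request fails.

    Hypothetical helper for illustration; `opener` defaults to urlopen
    and can be swapped for a stand-in when testing offline.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both subclasses of OSError.
        # Returning None lets the caller keep showing the previous data.
        return None
```

The caller can then skip the parsing step whenever `fetch_status()` returns `None` instead of letting the exception end the program.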
At this point, we only care about two XML tags: timestamp and subway. We don't want bus times, and we will need the timestamp tag later. We should also create a dictionary to store all of the data.
timestamp = soup.timestamp.text
subway = soup.subway
status_dict = {}
Within the subway tag, there are individual lines, each of which contains tags for the line name, the line status, and a description of that status. We can write a simple loop that pulls out each line from subway, and all 3 tags from each line.
all_lines = subway.find_all('line')
for line in all_lines:
    name = line.find('name').text
    status = line.find('status').text
    text = line.find('text').text
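To see what this extraction produces without hitting the network, here is a small sketch that runs the same idea on a hand-written sample. The sample XML is an assumption that only mirrors the tag names used in this post (`subway`, `line`, `name`, `status`, `text`); the real feed has more fields. It also swaps BeautifulSoup for the standard library's `xml.etree.ElementTree` so it runs without extra packages, and `extract_lines` is a hypothetical name:

```python
import xml.etree.ElementTree as ET

# Hand-written sample; only the tag names are taken from the article.
SAMPLE = """
<service>
  <subway>
    <line>
      <name>JZ</name>
      <status>PLANNED WORK</status>
      <text>Trains skip some stops</text>
    </line>
    <line>
      <name>L</name>
      <status>GOOD SERVICE</status>
      <text />
    </line>
  </subway>
  <bus>
    <line>
      <name>B1 - B84</name>
      <status>GOOD SERVICE</status>
      <text />
    </line>
  </bus>
</service>
"""


def extract_lines(xml_text):
    """Return {line name: (status, text)} for subway lines only."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    # Searching under <subway> only is what keeps the bus entries out.
    for line in root.find('subway').findall('line'):
        name = line.findtext('name')
        status = line.findtext('status')
        text = line.findtext('text') or ''
        result[name] = (status, text)
    return result


print(extract_lines(SAMPLE))
```

Only the two subway entries end up in the result; the bus entry is never visited.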
At this point, we have the data we want and can do some basic cleanup on it. For whatever reason, the MTA doesn't update the individual line time stamp for lines where the service status is "GOOD SERVICE". In order to provide complete data, we need to use the time stamp we got earlier from the parent XML file. This is slightly complicated by the fact that the time stamp in the XML file is in a different format from the time stamp provided by the line status. So, we need to first check whether the line time stamp exists, then use the parent time stamp if it doesn't, or use the line time stamp and reformat it if it does.
    # Unescape the HTML entities embedded in the status text
    text = text.replace('&lt;','<').replace('&gt;','>').replace('&nbsp;',' ')
    if line.find('Date').text == '':
        datetime = timestamp
    else:
        datetime = line.find('Date').text.strip(' ') + ' ' + line.find('Time').text.strip(' ')
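One way to sidestep the format mismatch is to parse both time stamps into `datetime` objects and format them once for display. This is only a sketch: `resolve_timestamp` is a hypothetical helper, and both format strings are assumptions about what the feed looks like, so check them against the live data before relying on them:

```python
from datetime import datetime

# Both formats are assumptions for illustration, not confirmed feed formats.
FEED_FORMAT = '%m/%d/%Y %I:%M:%S %p'   # e.g. '1/12/2017 9:43:00 AM'
LINE_FORMAT = '%m/%d/%Y %I:%M%p'       # e.g. '01/12/2017 9:40AM'


def resolve_timestamp(date_text, time_text, feed_timestamp):
    """Prefer the per-line date/time; fall back to the feed time stamp."""
    date_text = date_text.strip()
    time_text = time_text.strip()
    if not date_text:
        # No per-line time stamp (the "GOOD SERVICE" case described above).
        return datetime.strptime(feed_timestamp.strip(), FEED_FORMAT)
    return datetime.strptime(date_text + ' ' + time_text, LINE_FORMAT)
```

With real `datetime` objects in hand, a single `strftime` call can produce whatever display format the GUI needs.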
Finally, we take each of these values and put them into a list. This list goes into a dictionary keyed by line name, which gets sorted and put into an ordered dictionary. This way, the lines will always be returned in the same order. This example then goes on to print the dictionary; see the full code below to see how this is integrated into the GUI.
    status_line = [status, datetime, text]
    status_dict[name] = status_line
# Use a name other than "sorted" so the built-in function is not shadowed
sort = OrderedDict(sorted(status_dict.items()))
print(sort)
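To make the ordering concrete, here is a short sketch with made-up entries. The line names and statuses below are sample values chosen for illustration, not data from the feed:

```python
from collections import OrderedDict

# Made-up sample entries in arbitrary order.
status_dict = {
    'L': ['GOOD SERVICE', '', ''],
    'JZ': ['PLANNED WORK', '', ''],
    '123': ['DELAYS', '', ''],
    'ACE': ['GOOD SERVICE', '', ''],
}

# sorted() orders the (name, values) pairs by name; OrderedDict keeps that order.
sort = OrderedDict(sorted(status_dict.items()))
print(list(sort))  # ['123', 'ACE', 'JZ', 'L']
```

Plain string sorting puts names that start with a digit ahead of names that start with a letter. On Python 3.7 and later a regular `dict` also preserves insertion order, but on the Python 3.5 used for this build `OrderedDict` is what guarantees it.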
Python GUI
I used the Python library Tkinter to build the GUI for this project, which was probably the hardest part of the whole thing. I had zero experience with Tkinter (or UI design in general) when I started. That said, I think it turned out well.
Full code available on GitHub
### Created by Nick Wallace ###
### ###
### ###
import re
from tkinter import *
from tkinter import font
from bs4 import BeautifulSoup
from urllib.request import urlopen
from PIL import ImageTk, Image
from collections import OrderedDict
root = Tk()
root.title("MTA Service Status")
root.configure(background = "white")
#root.geometry('800x480')
#root.attributes('-fullscreen', True)
root.columnconfigure(0, weight = 1)
root.columnconfigure(1, weight = 1)
root.columnconfigure(2, weight = 1)
root.columnconfigure(3, weight = 1)
header_font = font.Font(family = 'Helvetica', weight = 'bold', size = 13)
main_font = font.Font(family = 'Helvetica', size = 12)
def getData():
    # Download and parse the feed; returns an OrderedDict keyed by line name
    global sort
    url = 'http://web.mta.info/status/serviceStatus.txt'
    xml = urlopen(url).read()
    soup = BeautifulSoup(xml, "xml")
    timestamp = soup.timestamp.text
    subway = soup.subway
    status_dict = {}
    all_lines = subway.find_all('line')
    for line in all_lines:
        name = line.find('name').text
        status = line.find('status').text
        text = line.find('text').text
        # Unescape the HTML entities embedded in the status text
        text = text.replace('&lt;','<').replace('&gt;','>').replace('&nbsp;',' ')
        if line.find('Date').text == '':
            # No per-line time stamp: fall back to the feed's, trimmed by the regex
            datetime = re.sub(':[:]*.{2}[:]* {1}', '', timestamp)
        else:
            datetime = line.find('Date').text.strip(' ') + ' ' + line.find('Time').text.strip(' ')
        status_line = [status, datetime, text]
        status_dict[name] = status_line
    sort = OrderedDict(sorted(status_dict.items()))
    return sort
def firstRun(data):
    # Build the header row, then one row of widgets per subway line
    global labels
    labels = {}
    header_img = Label(root, text = 'LINE', font = header_font, bg = 'white')
    header_img.grid(columnspan = 2)
    header_name = Label(root, text = 'STATUS', anchor = 'center', font = header_font, bg = 'white')
    header_name.grid(row = 0, column = 2)
    header_timestamp = Label(root, text = 'TIME', anchor = 'center', font = header_font, bg = 'white')
    header_timestamp.grid(row = 0, column = 3)
    rc = 1
    for k, v in data.items():
        img_url = 'c://Temp/imgs/' + k + '.png'
        img = ImageTk.PhotoImage(Image.open(img_url))
        Grid.rowconfigure(root, rc, weight=1)
        Grid.columnconfigure(root, rc, weight=1)
        line_img = Label(root, image = img, bg = 'white')
        # Keep a reference so the image is not garbage collected
        line_img.image = img
        line_img.grid(row = rc, columnspan = 2)
        #line_name = Label(root, text = k, bg = 'white')
        #line_name.grid(row = rc, column = 1, sticky = W)
        labels[(rc, 1)] = k
        line_status = Label(root, text = ' ' + v[0], font = main_font, bg = 'green' if v[0] == 'GOOD SERVICE' else 'yellow' if v[0] == 'PLANNED WORK' else 'red')
        line_status.grid(row = rc, column = 2)
        # Clicking (or tapping) a status other than GOOD SERVICE shows the details
        if v[0] not in ['GOOD SERVICE']:
            line_status.bind('<Button-1>', addMessage)
        labels[(rc, 2)] = line_status
        line_timestamp = Label(root, text = ' ' + v[1], font = main_font, bg = 'white')
        line_timestamp.grid(row = rc, column = 3)
        labels[(rc, 3)] = line_timestamp
        rc += 1
    blank_line = Label(root, text = '', bg = 'white')
    blank_line.grid(row = rc)
def addMessage(event):
    # Show the cleaned-up status text below the label that was clicked
    global msg_label
    grid_info = event.widget.grid_info()
    line = labels[(grid_info['row'],1)]
    msg_text = re.sub('<[^>]*>', '', sort[line][2])
    msg_text = re.sub('&[^;]*;', ' ', msg_text)
    msg_text = re.sub('\n+', '\n', msg_text)
    msg_text = re.sub(' +', ' ', msg_text)
    msg_text = re.sub('Show.*?Note:', '', msg_text)
    msg_text = re.sub('Key.*?Note:', '', msg_text)
    msg_text = re.sub(r' [ad].*relay.', '', msg_text)
    msg_label = Label(root, text = msg_text, anchor = 'w', justify = 'center', wraplength=480)
    msg_label.grid(row = grid_info['row'] + 1, columnspan = 4)
    # Clicking the message itself dismisses it
    msg_label.bind('<Button-1>', removeMessage)
def removeMessage(event):
    msg_label.grid_forget()
def refresh(data):
    # Update the status and time labels in place with freshly fetched data
    rc = 1
    for k, v in data.items():
        labels[(rc,2)].config(text = ' ' + v[0], bg = 'green' if v[0] == 'GOOD SERVICE' else 'yellow' if v[0] == 'PLANNED WORK' else 'red')
        labels[(rc,3)].config(text = ' ' + v[1])
        rc += 1
def exit():
    root.quit()
firstRun(getData())
# grid() returns None, so create each button first and place it afterwards
refreshButton = Button(root, text = "Refresh", command = lambda: refresh(getData()), height = 1, width = 15)
refreshButton.grid(row = 15, column = 0, columnspan = 3)
exitButton = Button(root, text = "Exit", command = exit, height = 1, width = 15)
exitButton.grid(row = 15, column = 2, columnspan = 3)
blank_line = Label(root, text = '', bg = 'white')
blank_line.grid(row = 16)
mainloop()
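The expression that picks the label colour appears in both `firstRun` and `refresh`, and the message cleanup lives inside the click handler. One possible refactor, sketched here with the hypothetical helper names `status_color` and `clean_message` (neither exists in the code above), pulls both into plain functions that can be tried without a display. The cleanup covers only the generic steps (tags, entities, whitespace), leaves out the feed-specific substitutions, and adds a final `strip()` that the original does not have:

```python
import re


def status_color(status):
    """Map a status string to the background colour used in the GUI."""
    if status == 'GOOD SERVICE':
        return 'green'
    if status == 'PLANNED WORK':
        return 'yellow'
    return 'red'


def clean_message(raw):
    """Strip HTML tags and entities, then collapse repeated whitespace."""
    text = re.sub(r'<[^>]*>', '', raw)     # drop tags such as <br/> or <b>
    text = re.sub(r'&[^;]*;', ' ', text)   # replace entities such as &nbsp;
    text = re.sub(r'\n+', '\n', text)      # collapse consecutive newlines
    text = re.sub(r' +', ' ', text)        # collapse runs of spaces
    return text.strip()


print(status_color('DELAYS'))
print(clean_message('<b>J</b>&nbsp;trains skip   <br/>Marcy Av'))
```

Both label-building spots could then call `status_color(v[0])`, so any change to the colour scheme would happen in one place.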
GUI in Action
The GUI is still a WIP, but click below to see the basic functionality: