Threadlets: Building a Vocabulary of Communication Patterns through Interactive Visual Analysis
TODO
Phase-1: Requirements Collection
Interviews with the Red Sift team and literature review
Get email thread IDs and merge with Enron database
Phase-2: Analysis & Design
Generate statistics from the threads (thread metrics); some ideas:
Pace of interaction
# of people (active/passive)
# of people (same organisation/different/personal)
# of inclusions/exclusions
sender diversity (# of unique senders / # of all involved)
thread length (in terms of duration, e.g., 5 days? / in terms of message count)
Compile the thread statistics into a CSV with the thread ID (tID) and each statistic as columns
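The metrics listed above could be compiled along these lines. This is a minimal sketch, not the project's actual script: the message fields (`tid`, `sender`, `recipients`, `sent`) and the sample data are assumptions, not the real Enron database schema. "Active" is taken here to mean anyone who sent at least one message, and "passive" anyone who only received; pace is approximated as the mean gap between consecutive messages.

```python
import csv
import io
from datetime import datetime

# Hypothetical input: one dict per message. Field names are assumptions,
# not the actual Enron database schema.
messages = [
    {"tid": "t1", "sender": "alice", "recipients": ["bob", "carol"], "sent": "2001-05-01T09:00:00"},
    {"tid": "t1", "sender": "bob", "recipients": ["alice", "carol"], "sent": "2001-05-01T09:30:00"},
    {"tid": "t1", "sender": "alice", "recipients": ["bob"], "sent": "2001-05-03T09:00:00"},
    {"tid": "t2", "sender": "dave", "recipients": ["alice", "bob", "carol"], "sent": "2001-06-01T12:00:00"},
]


def thread_metrics(msgs):
    """Compute per-thread metrics; returns one row dict per thread, sorted by tid."""
    threads = {}
    for m in msgs:
        threads.setdefault(m["tid"], []).append(m)

    rows = []
    for tid, ms in sorted(threads.items()):
        times = sorted(datetime.fromisoformat(m["sent"]) for m in ms)
        senders = {m["sender"] for m in ms}
        involved = senders.union(*(m["recipients"] for m in ms))
        duration_h = (times[-1] - times[0]).total_seconds() / 3600
        rows.append({
            "tid": tid,
            "message_count": len(ms),
            "duration_hours": duration_h,
            # Pace: mean gap between consecutive messages (0 for single-message threads).
            "mean_gap_hours": duration_h / (len(ms) - 1) if len(ms) > 1 else 0.0,
            "active_people": len(senders),              # sent at least once
            "passive_people": len(involved - senders),  # only ever received
            "sender_diversity": len(senders) / len(involved),
        })
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text with tid and the stats as columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = thread_metrics(messages)
print(to_csv(rows))
```

The resulting CSV can be loaded directly into R, Tableau, or Mondrian. Organisation-based counts and inclusion/exclusion counts are left out here because they depend on data not sketched above.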
Do preliminary (interactive) visual analysis using R/Tableau/Mondrian etc.
Pay special attention to forwarded (FW) emails and check whether they are frequent
Pay special attention to inclusion/exclusion and their combinations
Identify interesting threads/people/periods in Enron where notable events happened (literature/articles/etc.)
Look for topologies/patterns of communication
Broadcasting (announcement)
Information
Ping-pong
Forward-sequences
Loop
Short discussions
Bursty/Long discussions
Develop pattern/topology mining scripts to streamline the above
Branching
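A pattern-mining script for a few of the topologies above might start from simple rules like the following. This is only a sketch: the rules themselves (what counts as a broadcast, a forward-sequence, or a ping-pong), the `label_thread` and `msg` helpers, and the message fields are all illustrative assumptions, not agreed definitions. The remaining patterns (loop, branching, bursty discussions, and so on) are deliberately left as "other".

```python
def label_thread(msgs):
    """Assign a coarse pattern label to one thread (messages in time order).

    The rules are illustrative assumptions, not agreed definitions.
    """
    senders = [m["sender"] for m in msgs]

    if len(msgs) == 1:
        # A single message to several people, with no replies.
        return "broadcast" if len(msgs[0]["recipients"]) > 1 else "other"

    # Every message after the first is marked as a forward in its subject.
    if all(m["subject"].lower().startswith(("fw:", "fwd:")) for m in msgs[1:]):
        return "forward-sequence"

    # Exactly two senders who strictly alternate.
    alternating = all(a != b for a, b in zip(senders, senders[1:]))
    if len(set(senders)) == 2 and alternating:
        return "ping-pong"

    return "other"


def msg(sender, recipients, subject):
    """Hypothetical helper to build a message record for the examples."""
    return {"sender": sender, "recipients": recipients, "subject": subject}


announcement = [msg("alice", ["bob", "carol", "dave"], "Office closed Friday")]
ping_pong = [
    msg("alice", ["bob"], "Budget"),
    msg("bob", ["alice"], "RE: Budget"),
    msg("alice", ["bob"], "RE: Budget"),
]
forwards = [
    msg("alice", ["bob"], "Report"),
    msg("bob", ["carol"], "FW: Report"),
    msg("carol", ["dave"], "Fwd: FW: Report"),
]

for name, thread in [("announcement", announcement),
                     ("ping_pong", ping_pong),
                     ("forwards", forwards)]:
    print(name, "->", label_thread(thread))
```

Labels produced this way could be added as one more column in the thread statistics CSV, so that patterns and metrics can be explored together.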
Phase-3: Development
Try to integrate the above with the existing D3 prototype
simplification
aggregate individuals if they have identical send/receive patterns
aggregate messages
over time
by sender → is this useful at all?
combine two aggregations
ordering
individuals/groups
temporal: by when they first became involved in the thread
engagement: by how many messages they sent, then by how many they received
messages
are there any other ordering methods besides the temporal one that make sense?
absolute scale time
minor design tweaks
relaxed hovering (Voronoi-like) to highlight an individual
bcc type with dashed border
for each individual, add subtle lines connecting an exclusion message to the following inclusion message. Void space does a good job of indicating the exclusion/inclusion pattern, but if the exclusion period is long it becomes difficult to see (see john.shelk: does he get included back?). Caveat: the extra lines might make the vis look busy.
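The simplification idea above, aggregating individuals with identical send/receive patterns, could be prototyped before touching the D3 code. The sketch below is one possible reading, under stated assumptions: each person gets a per-message signature of `S` (sender), `R` (recipient), or `-` (not involved), and people with the same signature are merged into one group. The function name, message fields, and sample thread are hypothetical.

```python
from collections import defaultdict


def aggregate_individuals(msgs):
    """Group people whose role in every message of a thread is identical.

    A signature has one entry per message (in time order):
    "S" = sender, "R" = recipient, "-" = not involved (excluded).
    """
    people = sorted(
        {m["sender"] for m in msgs} | {r for m in msgs for r in m["recipients"]}
    )
    groups = defaultdict(list)
    for p in people:
        signature = tuple(
            "S" if m["sender"] == p else "R" if p in m["recipients"] else "-"
            for m in msgs
        )
        groups[signature].append(p)
    return dict(groups)


# Hypothetical thread: carol and dave receive the first two messages
# and are both dropped from the third.
thread = [
    {"sender": "alice", "recipients": ["bob", "carol", "dave"]},
    {"sender": "bob", "recipients": ["alice", "carol", "dave"]},
    {"sender": "alice", "recipients": ["bob"]},
]

for signature, members in aggregate_individuals(thread).items():
    print("".join(signature), members)
```

Because the signature also records the `-` gaps, the same representation could be used to find where an exclusion is followed by a later inclusion for a given individual.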
Phase-4: Evaluation
Identify/evaluate statistics and patterns that deliver value
TODO