Threadlets: Building a Vocabulary of Communication Patterns through Interactive Visual Analysis

TODO

Phase-1: Requirements Collection
  • Interviews with the Red Sift team and literature review
  • Get email thread IDs and merge with Enron database
Phase-2: Analysis & Design
  • Generate statistics from the threads (thread metrics). Some ideas:
      • Pace of interaction
      • # of people (active/passive)
      • # of people (same organisation/different/personal)
      • # of inclusions/exclusions
      • sender diversity (# of unique senders / # of all involved)
      • thread length (in terms of duration (e.g., 5 days?) / in terms of message count)
  • Compile the thread statistics in a CSV file with tIDs and stats as columns
  • Do preliminary (interactive) visual analysis using R/Tableau/Mondrian etc.
  • Pay special attention to FW emails / check if they are frequent
  • Pay special attention to inclusion/exclusion and their combinations
  • Identify interesting threads/people/time in Enron where stuff happened (literature/articles/etc.)
  • Look for topologies/patterns of communication
      • Broadcasting (announcement)
      • Information
      • Ping-pong
      • Forward-sequences
      • Loop
      • Short discussions
      • Bursty/Long discussions
      • Branching
  • Develop pattern/topology mining scripts to streamline the above
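
A minimal sketch of how the thread metrics listed in Phase-2 could be computed and compiled into a CSV, one row per thread. The message record layout (`tid`, `sender`, `recipients`, `sent`), the sample data, and the column names are all assumptions made for illustration; the actual Enron schema will differ. Inclusion/exclusion is approximated here by comparing the participant sets of consecutive messages, which is only one possible definition.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical input: one record per message (field names are assumptions).
messages = [
    {"tid": "t1", "sender": "alice", "recipients": ["bob", "carol"], "sent": "2001-05-01T09:00:00"},
    {"tid": "t1", "sender": "bob", "recipients": ["alice"], "sent": "2001-05-01T10:00:00"},
    {"tid": "t1", "sender": "alice", "recipients": ["bob", "carol"], "sent": "2001-05-03T09:00:00"},
    {"tid": "t2", "sender": "dave", "recipients": ["alice", "bob", "carol"], "sent": "2001-06-01T12:00:00"},
]


def thread_metrics(msgs):
    """Return one dict of statistics per thread ID."""
    by_thread = defaultdict(list)
    for m in msgs:
        by_thread[m["tid"]].append(m)

    rows = []
    for tid, thread in sorted(by_thread.items()):
        # ISO 8601 timestamps in the same format sort chronologically as strings.
        thread.sort(key=lambda m: m["sent"])
        times = [datetime.fromisoformat(m["sent"]) for m in thread]
        senders = {m["sender"] for m in thread}
        involved = senders | {r for m in thread for r in m["recipients"]}

        # Assumed definition: someone is "included" when they appear on a
        # message but not on the previous one, "excluded" in the reverse case.
        inclusions = exclusions = 0
        prev = None
        for m in thread:
            current = {m["sender"], *m["recipients"]}
            if prev is not None:
                inclusions += len(current - prev)
                exclusions += len(prev - current)
            prev = current

        duration_h = (times[-1] - times[0]).total_seconds() / 3600
        gaps = len(thread) - 1
        rows.append({
            "tid": tid,
            "message_count": len(thread),
            "duration_hours": round(duration_h, 2),
            "mean_gap_hours": round(duration_h / gaps, 2) if gaps else 0.0,  # pace
            "active_people": len(senders),              # sent at least once
            "passive_people": len(involved - senders),  # only ever received
            "sender_diversity": round(len(senders) / len(involved), 3),
            "inclusions": inclusions,
            "exclusions": exclusions,
        })
    return rows


def to_csv(rows):
    """Serialise the metric rows as CSV text with tIDs and stats as columns."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = thread_metrics(messages)
print(to_csv(rows))
```

The resulting CSV text can be written to a file and loaded into R, Tableau or Mondrian for the preliminary visual analysis. The same-organisation/different/personal split is left out because it needs an address-to-organisation mapping that is not specified here.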
Phase-3: Development
  • Try to integrate the above with the existing D3 prototype
  • simplification
      • aggregate individuals if they have identical send/receive patterns
      • aggregate messages
          • over time
          • by sender → is this of any use?
          • combine the two aggregations
  • ordering
      • individuals/groups
          • temporal: when they first became involved in the thread
          • engaged: how many messages they sent, then how many they received
      • messages
          • are there any ordering methods besides the temporal one that make sense?
          • absolute time scale
  • minor design tweaks
      • relaxed hovering (Voronoi-like) to highlight an individual
      • bcc type with dashed border
      • for each individual, add subtle lines to connect an exclusion message to the following inclusion message. Void space alone indicates the exclusion/inclusion pattern, but when the exclusion period is long it is difficult to see (see john.shelk: does he get included back?). The extra lines might make the vis look busy, though.
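
A rough sketch of the "aggregate individuals if they have identical send/receive patterns" idea from the simplification items above. Each person gets a per-message role signature (sent, received, or not involved), and people with the same signature are grouped. The `(sender, recipients)` tuple layout, the names, and the `S`/`R`/`-` encoding are assumptions for illustration, not part of the existing D3 prototype.

```python
from collections import defaultdict

# Hypothetical thread: each message is (sender, recipients), in temporal order.
thread = [
    ("alice", ["bob", "carol", "dave"]),
    ("bob", ["alice", "carol", "dave"]),
    ("alice", ["bob", "carol", "dave"]),
]


def role_signature(person, thread):
    """Encode a person's role in each message: S = sent, R = received, - = absent."""
    roles = []
    for sender, recipients in thread:
        if person == sender:
            roles.append("S")
        elif person in recipients:
            roles.append("R")
        else:
            roles.append("-")
    return "".join(roles)


def aggregate_individuals(thread):
    """Group people whose role signatures are identical across the whole thread."""
    people = {s for s, _ in thread} | {r for _, rs in thread for r in rs}
    groups = defaultdict(list)
    for person in sorted(people):
        groups[role_signature(person, thread)].append(person)
    return dict(groups)


groups = aggregate_individuals(thread)
print(groups)
```

In this made-up thread, carol and dave only ever receive, so they share a signature and could be drawn as a single row. The first non-`-` position in a signature also gives the "temporal" ordering key (when someone first became involved).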
Phase-4: Evaluation
  • Identify/evaluate statistics and patterns that deliver value
Phase-5: Integrating with Red Sift’s platform