Systems Research

  • Systems research: solving problems in computer systems
    • Performance, energy consumption, cost, security, etc.
  • The worst research: trying to solve non-existent problems
    • Make the performance of a keyboard device driver 10x faster
    • Does the performance of the keyboard device driver really matter?
  • Finding real, important problems is critical

How to find real, important problems

  • Read recent conference/workshop papers
    • Which problem areas are important
    • Which problems are already solved and which are still open
    • Which techniques are commonly used in this field
    • Which evaluation methods are commonly accepted for a proposed approach
  • However, a paper is the result of at least a year of effort
    • You are already about two years behind the authors

How to find real, important problems as early as possible

  • Attend top-tier conferences and chat with researchers
    • What are you working on these days?
    • A good way to get a sense of current research directions
    • If you are lucky, you can find a new, interesting problem
  • Build a prototype of an existing system
    • The best way to gain the deepest understanding of an existing system
    • It takes time, so it is risky if no new, interesting problem comes out of it
  • Benchmarking
    • An economical way to find new, important problems

How to use benchmarking to discover new problems

  • Run an existing benchmark in a new setting (see the sketch after this list)
    • E.g., run mosbench on a 200-core machine -> do any new bottlenecks appear?
    • E.g., run mosbench on Docker, a virtual machine, or a unikernel -> does the new isolation mechanism introduce new overhead?
    • E.g., run FxMark on an extremely fast NVM device or a slow SMR drive -> how do the performance characteristics of the storage device affect the performance and scalability of the storage stack?
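
  A minimal sketch of the "new setting" idea in Python: sweep core counts for an existing
  benchmark and record the throughput of each run. The ./bench binary, its single-number
  output, and the chosen core counts are hypothetical placeholders, not part of mosbench
  or FxMark.

    #!/usr/bin/env python3
    # Sweep core counts for an existing benchmark and record throughput per run.
    # "./bench" and its output format are assumptions for illustration only.
    import csv
    import subprocess

    CORE_COUNTS = [1, 2, 4, 8, 16, 32, 64, 128, 200]

    with open("scalability.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["cores", "throughput"])
        for cores in CORE_COUNTS:
            # Pin the benchmark to the first `cores` CPUs with taskset.
            cmd = ["taskset", "-c", f"0-{cores - 1}", "./bench"]
            out = subprocess.run(cmd, capture_output=True, text=True, check=True)
            writer.writerow([cores, float(out.stdout.strip())])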

How to use benchmarking to discover new problems

  • Compare existing approaches with the same benchmark (sketched below)
    • Gives a deeper understanding of existing approaches, beyond what is described in the papers
    • Helps you derive design principles for a new approach
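
  A minimal sketch of such a side-by-side comparison: run the same workload against
  several existing implementations and print one normalized number per system. The
  binary names, flags, and single-number output are hypothetical.

    #!/usr/bin/env python3
    # Run the same workload against several existing approaches and compare them.
    # The binaries, flags, and their throughput output are placeholders.
    import subprocess

    candidates = {"approach-a": "./bench_a", "approach-b": "./bench_b"}
    workload = ["--threads=32", "--update-ratio=0.1"]   # identical for every system

    for name, binary in candidates.items():
        out = subprocess.run([binary, *workload], capture_output=True,
                             text=True, check=True)
        print(f"{name:12s} {float(out.stdout.strip()):12.1f} transactions/msec")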

What is important in benchmarking

  • This is not a one-time job
    • Automation, automation, automation!
  • Without automation, this is not reproducible science
    • We humans do make mistakes
    • Automation is one way to avoid mistakes (or at least to make the same mistakes consistently)
  • If we are copying log files around by hand, that is a sign that we should automate

What is important in benchmarking

  • Automate as much as possible (see the sketch after this list)
    • Run all necessary combinations of experiments
    • Generate graphs
    • (Email the generated graphs automatically)
  • It really does affect our productivity!
    • Run the benchmark script before a meeting (or before going to bed, leaving the office, etc.)
    • After the meeting, check the graphs and decide what to run or profile next
  • bash, awk, and python are our friends
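
  A minimal sketch of that workflow in Python (matplotlib is used for plotting here,
  though gnuplot works just as well). The ./bench command, its flags, and its
  single-number output are assumptions for illustration.

    #!/usr/bin/env python3
    # Run every combination of benchmark parameters, then generate a graph.
    # "./bench", its flags, and its output format are placeholders.
    import itertools
    import subprocess
    import matplotlib
    matplotlib.use("Agg")          # render to a file; no display needed on a server
    import matplotlib.pyplot as plt

    threads_list = [1, 2, 4, 8, 16, 32]
    update_ratios = [0.0, 0.1, 0.5]

    results = {ratio: [] for ratio in update_ratios}
    for ratio, threads in itertools.product(update_ratios, threads_list):
        cmd = ["./bench", f"--threads={threads}", f"--update-ratio={ratio}"]
        out = subprocess.run(cmd, capture_output=True, text=True, check=True)
        results[ratio].append(float(out.stdout.strip()))  # assumed: throughput on stdout

    for ratio, ys in results.items():
        plt.plot(threads_list, ys, marker="o", label=f"update ratio {ratio}")
    plt.xlabel("threads")
    plt.ylabel("transactions/msec")
    plt.legend()
    plt.savefig("scalability.png")  # check this after the meeting (or email it)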

Examples of good benchmarks

  • Mosbench (or our fork, vbench)
    • https://pdos.csail.mit.edu/archive/mosbench/
    • https://github.com/sslab-gatech/vbench
  • FxMark
    • https://github.com/sslab-gatech/fxmark

Tips for more meaningful benchmarking

  • Make your results easy to compare
    • # transactions/msec instead of # transactions
    • Abort ratio instead of # aborts
  • Double-check benchmark parameters
    • Number of initial elements in a hash table
    • Update ratio
    • Random key distribution: uniform vs. Zipf (both tips are sketched after this list)
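
  A minimal sketch of both tips: report normalized metrics instead of raw counts, and
  make the key distribution an explicit, reported parameter. The raw counters are
  made-up example numbers, and numpy's Zipf sampler stands in for whatever skewed
  distribution your benchmark actually uses.

    #!/usr/bin/env python3
    # Normalize results and make workload parameters explicit.
    # The raw counters below are made-up examples, not measured data.
    import numpy as np

    # Normalized, comparable metrics instead of raw counts
    committed, aborted, elapsed_ms = 1_200_000, 30_000, 10_000
    print("throughput :", committed / elapsed_ms, "transactions/msec")
    print("abort ratio:", aborted / (committed + aborted))

    # Key distribution as an explicit, reported parameter
    TABLE_SIZE = 1_000_000          # initial elements in the hash table
    N_OPS = 100_000
    rng = np.random.default_rng(42)
    uniform_keys = rng.integers(0, TABLE_SIZE, size=N_OPS)   # uniform
    zipf_keys = rng.zipf(1.2, size=N_OPS) % TABLE_SIZE       # skewed (Zipf)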

Tips for more meaningful benchmarking

  • Know your hardware
  • Avoid well-known bottlenecks
    • Use a scalable memory allocator, such as jemalloc or tcmalloc (see the sketch after this list)
  • Use git, tmux, and gnuplot, seriously
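
  For the allocator tip, one common trick is to preload jemalloc when launching the
  benchmark. A minimal sketch, assuming a typical Linux library path (adjust for your
  distribution) and a placeholder ./bench binary:

    #!/usr/bin/env python3
    # Launch the benchmark with jemalloc preloaded so malloc itself does not
    # become the scalability bottleneck. The library path and "./bench" are
    # assumptions; adjust both for your system.
    import os
    import subprocess

    env = dict(os.environ)
    env["LD_PRELOAD"] = "/usr/lib/x86_64-linux-gnu/libjemalloc.so.2"
    subprocess.run(["./bench"], env=env, check=True)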