Tasklist #1

Open
15 of 26 tasks
earwig opened this issue Mar 31, 2014 · 0 comments

Labor division

  • Ben K.
    • Database, semantic analysis
  • Severyn
    • Frontend, crawler
  • Ben A.
    • Frontend, semantic analysis

Search functionality and parsing

Checklist

  • regex searching
  • identify language type
  • identify types of code (classes, functions, variables)
  • code written by author
  • identifying coding conventions
  • identifying techniques in code
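
As a rough illustration of the "identify types of code (classes, functions, variables)" item, here is a minimal sketch for Python source only, using the standard-library `ast` module. The function name `classify_symbols` and the sample input are assumptions made for this example, not part of the project.

```python
import ast

def classify_symbols(source: str) -> dict:
    """Collect class, function, and assigned variable names from Python source."""
    tree = ast.parse(source)
    found = {"classes": [], "functions": [], "variables": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            found["classes"].append(node.name)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found["functions"].append(node.name)
        elif isinstance(node, ast.Assign):
            # Only simple names are recorded; attribute and tuple targets are skipped.
            for target in node.targets:
                if isinstance(target, ast.Name):
                    found["variables"].append(target.id)
    return found

# Hypothetical sample input.
SAMPLE = "class Crawler:\n    def run(self):\n        count = 0\n        return count\n"
symbols = classify_symbols(SAMPLE)
```

Other languages would need their own parsers; this only shows the shape of the result a search index might store.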

Frontend

Checklist

  • base search syntax (for use without JavaScript)
  • autocompletion for components of the search (converted to base syntax)
  • scroll through results
  • show preview of relevant snippet with syntax highlighting
  • ability to take down results
  • show cached code
  • projects that a person may have contributed to
  • dates of authorship
  • types of code licenses
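
The "show preview of relevant snippet" item could be sketched as follows: find the first line matching a regex and return it with a little surrounding context. The name `preview_snippet` and the `context` parameter are assumptions for this example, and syntax highlighting is left out.

```python
import re
from typing import Optional

def preview_snippet(source: str, pattern: str, context: int = 1) -> Optional[str]:
    """Return the first matching line plus `context` lines on either side."""
    lines = source.splitlines()
    regex = re.compile(pattern)
    for index, line in enumerate(lines):
        if regex.search(line):
            start = max(0, index - context)
            end = min(len(lines), index + context + 1)
            return "\n".join(lines[start:end])
    return None  # no line matched the pattern

# Hypothetical cached file contents.
CACHED = "import os\n\ndef crawl(url):\n    return url\n"
snippet = preview_snippet(CACHED, r"def \w+")
```

A real frontend would probably return line numbers as well, so the preview can link into the cached code.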

Crawling

  • tree-like recursive crawling algorithm for GitHub
  • crawl new questions on Stack Overflow
  • obey each site's crawler rules (robots.txt)
  • recrawl active GitHub projects and contributions
  • crawl source code, modification date, authors, URL, project name, filename in project
  • 1 thread per site for crawling
  • semantic analysis when crawling (using tags)
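
For the robots.txt item, a minimal sketch using the standard-library `urllib.robotparser`. The rules are parsed from an in-memory string so the example runs offline. The user agent `example-crawler`, the URLs, and the helper names are assumptions for this example.

```python
from urllib.robotparser import RobotFileParser

def build_robots_checker(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

# Hypothetical robots.txt content for illustration.
ROBOTS = "User-agent: *\nDisallow: /private/\n"
checker = build_robots_checker(ROBOTS)

def may_crawl(url: str, agent: str = "example-crawler") -> bool:
    """Return True if the parsed rules allow `agent` to fetch `url`."""
    return checker.can_fetch(agent, url)

allowed = may_crawl("https://example.com/projects/readme")
blocked = may_crawl("https://example.com/private/keys")
```

With one thread per site, each thread could hold its own checker and consult it before every request.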

Checklist

  • GitHub
  • Stack Overflow
  • Bitbucket
  • Gitorious
  • local installs of VCS software
  • Launchpad
  • Google Code

Parser

  • The language of code without a file extension will be harder to detect
  • Will use Pygments, attempting to parse with the given language
  • Tree of languages organized by similarity
    • Will attempt to parse with the lowest nodes first, then move up the language tree
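
One possible reading of the language-tree idea, sketched under assumptions: "lowest nodes first" is taken to mean a post-order walk, so each specific language is tried before the family that contains it. The tree contents, `LanguageNode`, and the `try_parse` callback (a stand-in for a real Pygments-based parse attempt) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional

@dataclass
class LanguageNode:
    name: str
    children: List["LanguageNode"] = field(default_factory=list)

def bottom_up(node: LanguageNode) -> Iterator[str]:
    """Yield language names in post-order: children before their parent."""
    for child in node.children:
        yield from bottom_up(child)
    yield node.name

def detect_language(source: str, root: LanguageNode,
                    try_parse: Callable[[str, str], bool]) -> Optional[str]:
    """Return the first language, walking bottom-up, that parses `source`."""
    for name in bottom_up(root):
        if try_parse(name, source):
            return name
    return None

# Hypothetical similarity tree.
TREE = LanguageNode("any", [
    LanguageNode("c-family", [LanguageNode("c"), LanguageNode("java")]),
    LanguageNode("python"),
])

def fake_try_parse(language: str, source: str) -> bool:
    """Stub parser: accepts only Python-looking source."""
    return language == "python" and "def " in source

order = list(bottom_up(TREE))
guess = detect_language("def f(): pass", TREE, fake_try_parse)
```

A deepest-level-first ordering would be another valid interpretation; only `bottom_up` would need to change.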

Checklist

  • identify code language
  • parse code with ASTs
  • integrate with backend and search

Backend

  • Constructing a complex SQL query
  • Small test database dump in Git
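
To make "constructing a complex SQL query" concrete, here is a small sketch that assembles a parameterized query from optional filters and runs it against an in-memory SQLite database. The `code` table, its columns (loosely based on the crawled fields listed under Crawling, plus an assumed `language` column), and the sample rows are all hypothetical.

```python
import sqlite3

def build_query(language=None, author=None, text=None):
    """Build a parameterized SELECT from whichever filters are given."""
    clauses, params = [], []
    if language is not None:
        clauses.append("language = ?")
        params.append(language)
    if author is not None:
        clauses.append("author = ?")
        params.append(author)
    if text is not None:
        clauses.append("source LIKE ?")
        params.append(f"%{text}%")
    sql = "SELECT filename, url FROM code"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY filename", params

def search(conn, **filters):
    sql, params = build_query(**filters)
    return conn.execute(sql, params).fetchall()

# Tiny in-memory test database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE code (source TEXT, modified TEXT, author TEXT, "
             "url TEXT, project TEXT, filename TEXT, language TEXT)")
conn.executemany("INSERT INTO code VALUES (?, ?, ?, ?, ?, ?, ?)", [
    ("def crawl(): pass", "2014-03-31", "alice", "https://example.com/a",
     "demo", "crawl.py", "python"),
    ("int main() {}", "2014-04-01", "bob", "https://example.com/b",
     "demo", "main.c", "c"),
])
results = search(conn, language="python", text="crawl")
```

Using placeholders rather than string formatting keeps user-supplied search terms out of the SQL text.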

Checklist

Other

Tasks

  • advertising job openings

earwig added this to the Google Presentation milestone on Apr 14, 2014