Dependency activity collection #1609

Open
wants to merge 29 commits into
base: main
29 commits
8cafe8a
WIP dependency activity collection
andrew Jul 19, 2017
e96a851
Update repo_miner
andrew Jul 20, 2017
8d710bb
Save mined activities to the database
andrew Jul 20, 2017
f87e4fd
Skip previously mined commits
andrew Jul 20, 2017
dd1265e
Fix typo
andrew Jul 20, 2017
6293412
Better error handling
andrew Jul 20, 2017
04d5236
Update bibliothecary and repo_miner
andrew Jul 25, 2017
7359657
Projects have dependency activities
andrew Jul 25, 2017
c2329d5
Only clone the default branch
andrew Jul 28, 2017
10a7113
Import dependency activities in one sql query using activerecord-import
andrew Jul 28, 2017
d03922b
Update appsignal
andrew Jul 28, 2017
e69c83b
Skip mining dependency activities from forks
andrew Jul 28, 2017
d208c3c
WIP dependency activity collection
andrew Jul 19, 2017
345fb1a
Update repo_miner
andrew Jul 20, 2017
cde7641
Save mined activities to the database
andrew Jul 20, 2017
f9066ef
Skip previously mined commits
andrew Jul 20, 2017
3858e37
Fix typo
andrew Jul 20, 2017
506670e
Better error handling
andrew Jul 20, 2017
0b1f532
Update bibliothecary and repo_miner
andrew Jul 25, 2017
817600b
Projects have dependency activities
andrew Jul 25, 2017
3658a9c
Only clone the default branch
andrew Jul 28, 2017
dcf8604
Import dependency activities in one sql query using activerecord-import
andrew Jul 28, 2017
29b0a58
Skip mining dependency activities from forks
andrew Jul 28, 2017
f31de5b
Merge branch 'master' into dependency_activities
andrew Sep 12, 2017
23ffa4b
Merge branch 'dependency_activities' of https://github.com/librariesi…
andrew Sep 12, 2017
7316582
Merge branch 'master' into dependency_activities
andrew Nov 24, 2017
797c105
Fix elasticsearch import
andrew Nov 24, 2017
10abc64
Avoid conflict with elasticsearch import method
andrew Nov 24, 2017
46cfb2d
Store branch when mining dependency activities
andrew Nov 27, 2017
2 changes: 2 additions & 0 deletions Gemfile
@@ -12,6 +12,7 @@ gem 'jquery-rails'
gem 'octokit'
gem 'bootstrap-sass'
gem 'will_paginate-bootstrap'
gem 'activerecord-import'
gem 'elasticsearch', '~> 2'
gem 'elasticsearch-model'
gem 'elasticsearch-rails'
@@ -80,6 +81,7 @@ gem 'pghero'
gem 'pg_query'
gem 'schema_plus_pg_indexes'
gem 'autoprefixer-rails', '~> 7.1.2.1'
gem 'repo_miner'
gem 'amatch'
gem 'concurrent-ruby-ext'
gem 'charlock_holmes', '>= 0.7.5'
7 changes: 7 additions & 0 deletions Gemfile.lock
@@ -59,6 +59,8 @@ GEM
activemodel (= 5.0.6)
activesupport (= 5.0.6)
arel (~> 7.0)
activerecord-import (0.21.0)
activerecord (>= 3.2)
activesupport (5.0.6)
concurrent-ruby (~> 1.0, >= 1.0.2)
i18n (~> 0.7)
@@ -383,6 +385,9 @@ GEM
rb-readline (0.5.5)
rdoc (5.1.0)
redis (4.0.1)
repo_miner (0.3.1)
bibliothecary
rugged
request_store (1.3.2)
rspec (3.7.0)
rspec-core (~> 3.7.0)
@@ -539,6 +544,7 @@ PLATFORMS
DEPENDENCIES
RedCloth
active_model_serializers
activerecord-import
amatch
api-pagination
appsignal (~> 2.3.0)
@@ -619,6 +625,7 @@ DEPENDENCIES
rb-readline
rdoc
redis
repo_miner
rspec-rails
rspec-sidekiq
rspec_junit_formatter
95 changes: 95 additions & 0 deletions app/models/concerns/dependency_miner.rb
@@ -0,0 +1,95 @@
module DependencyMiner
  def mine_dependencies
    return if scm == 'hg' # only works with git repositories at the moment
    return if fork?

    tmp_dir_name = "#{host_type}-#{owner_name}-#{project_name}".downcase

    tmp_path = Rails.root.join("tmp/#{tmp_dir_name}")

    # download code; arguments are passed as a list so that branch names
    # and URLs are never interpreted by a shell
    system 'git', 'clone', '-b', default_branch.to_s, '--single-branch', url.to_s, tmp_path.to_s

    return unless tmp_path.exist? # handle failed clones

    # mine dependency activity from git repository
    miner = RepoMiner::Repository.new(tmp_path.to_s)

    # Find last commit analysed
    last_commit_sha = dependency_activities.order('committed_at DESC').first.try(:commit_sha)

    # analyse commits on the default branch, starting from the last commit analysed
    commits = miner.analyse(default_branch, last_commit_sha)

    # only consider commits with dependency data
    dependency_commits = commits.select { |c| c.data[:dependencies].present? }

    # build one attributes hash per dependency change
    activities = []
    if dependency_commits.any?
      dependency_commits.each do |commit|
        dependency_data = commit.data[:dependencies]

        dependency_data[:added_manifests].each do |added_manifest|
          added_manifest[:added_dependencies].each do |added_dependency|
            activities << format_activity(commit, added_manifest, added_dependency, 'added')
          end
        end

        dependency_data[:modified_manifests].each do |modified_manifest|
          modified_manifest[:added_dependencies].each do |added_dependency|
            activities << format_activity(commit, modified_manifest, added_dependency, 'added')
          end

          modified_manifest[:modified_dependencies].each do |modified_dependency|
            activities << format_activity(commit, modified_manifest, modified_dependency, 'modified')
          end

          modified_manifest[:removed_dependencies].each do |removed_dependency|
            activities << format_activity(commit, modified_manifest, removed_dependency, 'removed')
          end
        end

        dependency_data[:removed_manifests].each do |removed_manifest|
          removed_manifest[:removed_dependencies].each do |removed_dependency|
            activities << format_activity(commit, removed_manifest, removed_dependency, 'removed')
          end
        end
      end
    end

    # write activities to the database as DependencyActivity records
    DependencyActivity.ar_import(activities.map { |a| DependencyActivity.new(a) })
  ensure
    # delete code; tmp_path is nil when we returned before it was assigned
    FileUtils.rm_rf(tmp_path) if tmp_path
  end

  # look up the project by exact name first, then fall back to a case-insensitive match
  def find_project_id(project_name, platform)
    project_id = Project.platform(platform).where(name: project_name.try(:strip)).limit(1).pluck(:id).first
    return project_id if project_id
    Project.lower_platform(platform).lower_name(project_name.try(:strip)).limit(1).pluck(:id).first
  end

  # attributes for a single DependencyActivity row
  def format_activity(commit, manifest, dependency, action)
    {
      repository_id: id,
      project_id: find_project_id(dependency[:name], manifest[:platform]),
      action: action,
      project_name: dependency[:name],
      commit_message: commit.message,
      requirement: dependency[:requirement],
      kind: dependency[:type],
      manifest_path: manifest[:path],
      manifest_kind: manifest[:kind],
      commit_sha: commit.sha,
      platform: manifest[:platform],
      previous_requirement: dependency[:previous_requirement],
      previous_kind: dependency[:previous_type],
      committed_at: commit.timestamp,
      branch: default_branch
    }
  end
end
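The nested loops in `mine_dependencies` flatten each commit's dependency data into one row per dependency change. A minimal plain-Ruby sketch of that flattening follows. The `SAMPLE` hash, the `RULES` table and `flatten_activities` are hypothetical names used only for illustration; the hash shape is assumed from the keys the diff reads, not taken from repo_miner's documentation.

```ruby
# Hypothetical sample shaped like the commit.data[:dependencies] hash read in the diff.
SAMPLE = {
  added_manifests: [
    { path: 'Gemfile', added_dependencies: [{ name: 'rails' }, { name: 'pg' }] }
  ],
  modified_manifests: [
    { path: 'package.json',
      added_dependencies: [{ name: 'lodash' }],
      modified_dependencies: [{ name: 'react' }],
      removed_dependencies: [{ name: 'jquery' }] }
  ],
  removed_manifests: [
    { path: 'bower.json', removed_dependencies: [{ name: 'backbone' }] }
  ]
}.freeze

# For each manifest group: which dependency lists to read, and the action each maps to.
RULES = {
  added_manifests:    { added_dependencies: 'added' },
  modified_manifests: { added_dependencies: 'added',
                        modified_dependencies: 'modified',
                        removed_dependencies: 'removed' },
  removed_manifests:  { removed_dependencies: 'removed' }
}.freeze

# Returns one hash per dependency change, in the same order as the nested loops.
def flatten_activities(dependency_data)
  RULES.flat_map do |manifest_key, lists|
    dependency_data.fetch(manifest_key, []).flat_map do |manifest|
      lists.flat_map do |list_key, action|
        manifest.fetch(list_key, []).map do |dependency|
          { action: action, manifest_path: manifest[:path], project_name: dependency[:name] }
        end
      end
    end
  end
end

flatten_activities(SAMPLE).each { |row| puts row.values.join(' ') }
```

Unlike the code in the diff, this sketch tolerates missing keys by treating them as empty lists; whether repo_miner always supplies every key is not something the diff confirms.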
4 changes: 4 additions & 0 deletions app/models/dependency_activity.rb
@@ -0,0 +1,4 @@
class DependencyActivity < ApplicationRecord
  belongs_to :repository
  belongs_to :project
end
2 changes: 2 additions & 0 deletions app/models/project.rb
@@ -31,6 +31,8 @@ class Project < ApplicationRecord
has_many :dependent_repositories, -> { group('repositories.id').order('repositories.rank DESC NULLS LAST, repositories.stargazers_count DESC') }, through: :dependent_manifests, source: :repository
has_many :subscriptions
has_many :project_suggestions, dependent: :delete_all
has_many :dependency_activities
belongs_to :repository
has_one :readme, through: :repository

scope :platform, ->(platform) { where(platform: PackageManager::Base.format_name(platform)) }
3 changes: 2 additions & 1 deletion app/models/repository.rb
@@ -3,6 +3,7 @@ class Repository < ApplicationRecord
include Status
include RepoManifests
include RepositorySourceRank
include DependencyMiner

# eager load this module to avoid clashing with Gitlab gem in development
RepositoryHost::Gitlab
@@ -23,7 +24,7 @@ class Repository < ApplicationRecord
has_many :dependencies, through: :manifests, source: :repository_dependencies
has_many :dependency_projects, -> { group('projects.id').order("COUNT(projects.id) DESC") }, through: :dependencies, source: :project
has_many :dependency_repos, -> { group('repositories.id') }, through: :dependency_projects, source: :repository

has_many :dependency_activities
has_many :repository_subscriptions, dependent: :delete_all
has_many :web_hooks, dependent: :delete_all
has_many :issues, dependent: :delete_all
9 changes: 9 additions & 0 deletions config/application.rb
@@ -9,6 +9,15 @@
require "action_view/railtie"
require "sprockets/railtie"

require 'activerecord-import/base'

# keep activerecord-import's bulk insert available as ar_import and free up
# the import name
class ActiveRecord::Base
  class << self
    alias :ar_import :import
    remove_method :import
  end
end

# Require the gems listed in Gemfile, including any gems
# you've limited to :test, :development, or :production.
Bundler.require(*Rails.groups)
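
The `alias` plus `remove_method` pair above can be tried outside Rails. The sketch below uses stand-in classes only: `Base`, `Searchable` and `Activity` are hypothetical, and it assumes the conflict arises because a search mixin defines its own class-level `import` only when that name is not already taken. The diff itself does not spell out the mechanism.

```ruby
# Stand-in for a base class that ships a bulk `import` class method.
class Base
  class << self
    def import(records)
      "bulk insert of #{records.size} records"
    end
  end
end

# Same move as the diff: keep the method under a new name, then drop the old name.
class Base
  class << self
    alias ar_import import
    remove_method :import
  end
end

# Hypothetical search mixin that only claims `import` when the name is free.
module Searchable
  def self.included(base)
    return if base.respond_to?(:import)

    base.define_singleton_method(:import) { 'search index import' }
  end
end

class Activity < Base
  include Searchable
end

puts Activity.import               # handled by the search mixin
puts Activity.ar_import([1, 2, 3]) # handled by the renamed bulk insert
```

`remove_method` only removes a method defined directly on the receiver, so this works here because `import` lives on `Base`'s own singleton class.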
22 changes: 22 additions & 0 deletions db/migrate/20170719162634_create_dependency_activities.rb
@@ -0,0 +1,22 @@
class CreateDependencyActivities < ActiveRecord::Migration[5.0]
  def change
    create_table :dependency_activities do |t|
      t.references :repository, index: true
      t.references :project, index: true
      t.string :action
      t.string :project_name
      t.string :commit_message
      t.string :requirement
      t.string :kind
      t.string :manifest_path
      t.string :manifest_kind
      t.string :commit_sha
      t.string :platform
      t.string :previous_requirement
      t.string :previous_kind
      t.datetime :committed_at, index: true

      t.timestamps
    end
  end
end
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
class AddBranchToDependencyActivities < ActiveRecord::Migration[5.0]
  def change
    add_column :dependency_activities, :branch, :string
  end
end
22 changes: 21 additions & 1 deletion db/schema.rb
@@ -10,7 +10,7 @@
#
# It's strongly recommended that you check this file into your version control system.

-ActiveRecord::Schema.define(version: 20171109154509) do
+ActiveRecord::Schema.define(version: 20171127120330) do

# These are extensions that must be enabled in order to support this database
enable_extension "plpgsql"
@@ -54,6 +54,26 @@
t.datetime "updated_at"
end

create_table "dependency_activities", force: :cascade do |t|
t.integer "repository_id", :index=>{:name=>"index_dependency_activities_on_repository_id"}
t.integer "project_id", :index=>{:name=>"index_dependency_activities_on_project_id"}
t.string "action"
t.string "project_name"
t.string "commit_message"
t.string "requirement"
t.string "kind"
t.string "manifest_path"
t.string "manifest_kind"
t.string "commit_sha"
t.string "platform"
t.string "previous_requirement"
t.string "previous_kind"
t.datetime "committed_at", :index=>{:name=>"index_dependency_activities_on_committed_at"}
t.datetime "created_at", :null=>false
t.datetime "updated_at", :null=>false
t.string "branch"
end

create_table "identities", force: :cascade do |t|
t.string "uid", :index=>{:name=>"index_identities_on_uid"}
t.string "provider"
5 changes: 5 additions & 0 deletions spec/models/dependency_activity_spec.rb
@@ -0,0 +1,5 @@
require 'rails_helper'

RSpec.describe DependencyActivity, type: :model do
  pending "add some examples to (or delete) #{__FILE__}"
end