sweep.dev
52.58.254.253
Public Scan
Submitted URL: http://sweep.dev/
Effective URL: https://sweep.dev/
Submission: On March 05 via api from US — Scanned from DE
Form analysis
0 forms found in the DOM

Text Content
SWEEP: UNIT TEST MY DATA PIPELINE

AI Junior Developer that handles small features in your codebase

Install Sweep
Join our Discord

Clean up your tech debt, automatically

Sweep generates repository-level code at your command. Cut down your dev time on mundane tasks, like tests, documentation, and refactoring.

Refactor vector_db.py by making get_deeplake_vs_from_repo more modular

This PR refactors the get_deeplake_vs_from_repo function in sweepai/core/vector_db.py to make it more modular. The function was quite large and performed multiple tasks, including reading files from a repository, preparing a lexical search index, scoring for vector search, computing all scores, preparing documents, metadatas, and ids, and computing embeddings.

sweepai/core/vector_db.py
--------------------------------------------------------------------------------
 ...

     logger.info("Recursively getting list of files...")
     blocked_dirs = get_blocked_dirs(repo)
     sweep_config.exclude_dirs.extend(blocked_dirs)
-
-    snippets, file_list = repo_to_chunks(cloned_repo.cache_dir, sweep_config)
-    logger.info(f"Found {len(snippets)} snippets in repository {repo_full_name}")
-    # prepare lexical search
-    index = prepare_index_from_snippets(
-        snippets, len_repo_cache_dir=len(cloned_repo.cache_dir) + 1
-    )
-    logger.print("Prepared index from snippets")
+    file_list, snippets, index = prepare_lexical_search_index(cloned_repo, sweep_config, repo_full_name)

     # scoring for vector search
     files_to_scores = compute_vector_search_scores(
         file_list, cloned_repo, repo_full_name
     )

 ...

+def prepare_lexical_search_index(cloned_repo, sweep_config, repo_full_name):
+    snippets, file_list = repo_to_chunks(cloned_repo.cache_dir, sweep_config)
+    logger.info(f"Found {len(snippets)} snippets in repository {repo_full_name}")
+    # prepare lexical search
+    index = prepare_index_from_snippets(
+        snippets, len_repo_cache_dir=len(cloned_repo.cache_dir) + 1
+    )
+    logger.print("Prepared index from snippets")
+    return file_list, snippets, index

I made a Pull Request for you at #2470!

Unit test your new features, automatically

Sweep writes unit tests and fixes any broken business logic they catch. See the example

kevinlu1248 pushed 1 commit to sweepai/sweep, editing sweepai/utils/graph_test.py

Let me write and run a unit test for the new GraphChild component.

sweepai/utils/graph_test.py
--------------------------------------------------------------------------------
class TestGraph(unittest.TestCase):
    def setUp(self):
        self.graph = Graph(
            definitions_graph=nx.DiGraph(), references_graph=nx.DiGraph()
        )

    def test_extract_first_degree(self):
        with patch(
            "sweepai.utils.graph.Graph.find_definitions"
        ) as mock_find_definitions, patch(
            "sweepai.utils.graph.Graph.find_references"
        ) as mock_find_references, patch(
            "sweepai.utils.graph.condense_paths"
        ) as mock_condense_paths:
            mock_find_definitions.return_value = [["file1", "symbol1", "file2"]]
            mock_find_references.return_value = [["file1", "symbol1", "file2"]]
            mock_condense_paths.return_value = [["file1", "symbol1", "file2"]]
            with patch(
                "sweepai.utils.graph.Graph.topological_sort"
            ) as mock_topological_sort:
                mock_topological_sort.return_value = ["file1", "file2"]
                result = self.graph.topological_sort(["file1", "file2"])
                self.assertEqual(result, ["file1", "file2"])

...
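
The listing above relies on unittest.mock.patch to stub out graph lookups. A minimal, self-contained sketch of that pattern follows. The Graph class here is a hypothetical stand-in, not Sweep's actual sweepai.utils.graph.Graph, and the " -> " output format is an assumption made purely for illustration. The test patches only a dependency and then calls the real method under test:

```python
import unittest
from unittest.mock import patch


class Graph:
    """Hypothetical stand-in for illustration; not Sweep's real Graph class."""

    def find_definitions(self, file_path):
        # Pretend this is an expensive lookup that tests should not run.
        raise RuntimeError("expensive lookup; should be patched in tests")

    def extract_first_degree(self, file_path):
        # Code under test: formats whatever find_definitions returns.
        # The " -> " separator is an assumed format, chosen for the example.
        paths = self.find_definitions(file_path)
        return "\n".join(" -> ".join(path) for path in paths)


class TestGraph(unittest.TestCase):
    def test_extract_first_degree(self):
        # Patch the dependency only, then exercise the real method.
        with patch.object(Graph, "find_definitions") as mock_find:
            mock_find.return_value = [["file1", "symbol1", "file2"]]
            result = Graph().extract_first_degree("file1")
        self.assertEqual(result, "file1 -> symbol1 -> file2")
        mock_find.assert_called_once_with("file1")
```

Saved as a test module, this can be run with python -m unittest. Patching the dependency rather than the method being asserted on keeps the assertion tied to the real formatting logic instead of to the mock's return value.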
--------------------------------------------------------------------------------
> python -m unittest -v sweepai/utils/graph_test.py
Traceback (most recent call last):
  File "/repo/sweepai/utils/graph_test.py", line 35, in test_extract_first_degree
    self.assertEqual(result, ["file1", "file2"])
AssertionError: Lists differ: [] != ['file1', 'file2']

Second list contains 2 additional elements.
First extra element 0:
'file1'

- []
+ ['file1', 'file2']

======================================================================
FAIL: test_paths_to_first_degree_entities (__main__.TestGraph)

It looks like there's an edge case when there are empty strings in the path. Let me fix that.

sweepai/core/graph_child.py
--------------------------------------------------------------------------------
 class Graph(BaseModel):
     ...

     def paths_to_first_degree_entities(self, file_paths: list[str]):
-        return "\n".join(
-            [self.extract_first_degree(file_path) for file_path in file_paths]
-        )
+        paths = [self.extract_first_degree(file_path) for file_path in file_paths]
+        if paths and paths[-1] == "":
+            paths = paths[:-1]
+        return "\n".join(paths)

I made a Pull Request for you at #2380!

See example tickets, handled by Sweep

Example Sweep tickets

What our users are saying

Cody Ebberson, Co-founder + CTO at Medplum
The true ingenuity lies in their many product intricacies ... from adhering to coding standards and integrating with static analysis, to responsive feedback. Their commitment to getting every detail right, paired with a swift incorporation of lessons into tangible enhancements, really sets Sweep AI apart.

Ashaya Sharma, Co-founder + CTO at Honeycomb.ai
I was able to get it to make a PR which took my junior developer 2 days to do. Been great so far.

Eduardo Reis, AI @ Stanford AIMI
✨Wow! Just found sweep.dev from @wwzeng1 @KevinLu45010771. It wrote these two PRs for edreisMD/plugnplai#91 edreisMD/plugnplai#75 Total lifesaver 🙌. Sweep just saved me 6 hours of work.

Sagar Patil, Product Manager, SSL Zen
Sweep helped me fix 2 issues in less than 10 mins. This would have took me at least 30-45 mins manually. I also have to say everything is very fast now. It's working great, just one message and it intelligently understands the problem and suggests a fix that just works! Kudos to you guys!

Kunal Gupta, CEO of Withfriends
It’s a little bit like having a junior intern, which doesn’t sound like a lot at first, but you can run like 100 junior interns at once and they can cover a lot of ground in parallel.

Jeremy Evans, Co-founder + CTO at savvy
Holy crap, I'm seriously impressed 🤯. Other than one issue it seems to be word-perfect. Exactly how I'd write it, and it understands all our company-specific concepts. Very impressive! 🙌

Develop at ease, with Sweep

Get Started

© 2023 Sweep AI, Inc.