Commit e5ef4f6

fix: update url parser rules
1 parent 4219152 commit e5ef4f6

3 files changed

Lines changed: 112 additions & 43 deletions

README.md

Lines changed: 86 additions & 36 deletions
@@ -9,12 +9,14 @@ A fast code analysis tool for remote repositories with multi-platform support.

 ## Features

-- Fast analysis of remote repositories
-- Multi-platform support: GitHub, GitLab, Bitbucket, Codeberg
-- Multiple output formats: Table, JSON, CSV, XML
-- Progress tracking with download speed
-- Token-based authentication for private repositories
-- Cross-platform: Linux, macOS, Windows
+- **Asynchronous Repository Processing**: Implements non-blocking HTTP client with connection pooling and concurrent stream processing for efficient remote repository fetching and decompression
+- **Multi-Platform URL Resolution**: Features intelligent URL parsing engine that normalizes different Git hosting platform APIs (GitHub, GitLab, Bitbucket, Codeberg) into unified archive endpoints with branch/commit resolution
+- **Streaming Archive Analysis**: Processes tar.gz archives directly in memory using streaming decompression without temporary file extraction, reducing I/O overhead and memory footprint
+- **Language Detection Engine**: Implements rule-based file extension and content analysis system supporting 150+ programming languages with configurable pattern matching and statistical computation
+- **Real-time Progress Monitoring**: Features bandwidth-aware progress tracking with download speed calculation, ETA estimation, and adaptive UI rendering for terminal environments
+- **Structured Data Serialization**: Provides multiple output format engines (Table, JSON, CSV, XML) with schema validation and type-safe serialization for integration with external tools
+- **Authentication Layer**: Implements OAuth token management with secure credential handling for accessing private repositories across different hosting platforms
+- **Cross-Platform Binary Distribution**: Supports native compilation targets for Linux, macOS, and Windows with platform-specific optimizations and dependency management

 ## Installation

@@ -44,50 +46,102 @@ bytes-radar [OPTIONS] <URL>

 ### Examples

+#### Basic Repository Analysis
+
+Analyze GitHub repositories using shorthand notation:
+
 ```bash
-# GitHub repository
 bytes-radar torvalds/linux
+bytes-radar microsoft/typescript
+bytes-radar rust-lang/cargo
+```
+
+#### Branch and Commit Targeting

-# Specific branch or commit
+Specify particular branches or commit hashes for analysis:
+
+```bash
 bytes-radar microsoft/vscode@main
-bytes-radar rust-lang/rust@abc1234
+bytes-radar kubernetes/kubernetes@release-1.28
+bytes-radar rust-lang/rust@abc1234567
+```

-# Other platforms
-bytes-radar https://gitlab.com/user/repo
-bytes-radar https://bitbucket.org/user/repo
+#### Multi-Platform Repository Support

-# Output formats
+Analyze repositories from different Git hosting platforms:
+
+```bash
+bytes-radar https://gitlab.com/gitlab-org/gitlab
+bytes-radar https://bitbucket.org/atlassian/stash
+bytes-radar https://codeberg.org/forgejo/forgejo
+```
+
+#### Output Format Configuration
+
+Generate analysis results in structured data formats:
+
+```bash
 bytes-radar -f json torvalds/linux
-bytes-radar -f csv user/repo
+bytes-radar -f csv microsoft/typescript
+bytes-radar -f xml rust-lang/cargo
+```
+
+#### Private Repository Access

-# Private repositories
-bytes-radar --token ghp_xxx private/repo
+Authenticate with platform tokens for private repository analysis:

-# Minimal output
-bytes-radar --quiet user/repo
+```bash
+bytes-radar --token ghp_xxxxxxxxxxxxxxxxxxxx private-org/confidential-repo
+bytes-radar --token glpat-xxxxxxxxxxxxxxxxxxxx https://gitlab.com/private-group/project
+```
+
+#### Performance and Output Control
+
+Configure analysis behavior and output verbosity:
+
+```bash
+bytes-radar --quiet --no-progress user/repo
+bytes-radar --timeout 600 --detailed large-org/massive-repo
 ```

 ## Output Formats

 ### Table (Default)
-```
+```shell
+$ bytes-radar torvalds/linux
+Analyzing: https://github.com/torvalds/linux
+Analysis completed in 126.36s
+
 ================================================================================
-Project linux@master
-Total Files 75,823
-Total Lines 28,691,744
-Code Lines 22,453,891
-Comment Lines 3,891,234
-Blank Lines 2,346,619
-Languages 42
+Project linux@main
+Total Files 89,639
+Total Lines 40,876,027
+Code Lines 31,293,116
+Comment Lines 4,433,479
+Blank Lines 5,149,432
+Languages 14
 Primary Language C
-Code Ratio 78.3%
-Documentation 13.6%
+Code Ratio 76.6%
+Documentation 14.2%
 ================================================================================
-Language Files Lines Code Comments Blanks Share%
+Language Files Lines Code Comments Blanks Share%
 ================================================================================
-C 14,523 18,234,567 15,234 1,234,567 1,765,766 63.6%
-Assembly 2,341 3,456,789 2,891 234,567 321,331 12.0%
-...
+C 35,586 25,268,107 18,782,347 2,836,806 3,648,954 61.8%
+CHeader 25,845 10,247,647 7,953,679 1,528,043 765,925 25.1%
+Text 20,954 3,917,052 3,324,410 0 592,642 9.6%
+Json 961 572,657 572,655 0 2 1.4%
+Yaml 4,862 548,408 436,698 22,250 89,460 1.3%
+Sh 960 189,965 132,288 23,686 33,991 0.5%
+Python 293 89,285 69,449 5,770 14,066 0.2%
+Rust 158 39,561 19,032 16,697 3,832 0.1%
+Cpp 7 2,267 1,836 96 335 0.0%
+Markdown 3 578 436 0 142 0.0%
+Css 3 295 172 69 54 0.0%
+CppHeader 2 125 59 47 19 0.0%
+Toml 3 47 28 12 7 0.0%
+Html 2 33 27 3 3 0.0%
+================================================================================
+Total 89,639 40,876,027 31,293,116 4,433,479 5,149,432 100.0%
 ```

 ### JSON Output
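
The percentages in the new sample table can be cross-checked against its totals. The sketch below does that arithmetic. The formulas are inferred from the figures shown (Code Ratio as code lines over total lines, Documentation as comment lines over code lines, Share% as a language's lines over total lines); they are assumptions, not taken from the bytes-radar source, and `percent` is a hypothetical helper.

```rust
/// Hypothetical helper: `part` as a percentage of `whole`.
/// Returns 0.0 for an empty denominator instead of dividing by zero.
pub fn percent(part: u64, whole: u64) -> f64 {
    if whole == 0 {
        0.0
    } else {
        part as f64 * 100.0 / whole as f64
    }
}

/// Formats a percentage with one decimal place, as in the sample table.
pub fn one_decimal(value: f64) -> String {
    format!("{:.1}%", value)
}

/// Recomputes the three summary figures from the totals in the sample output.
/// Assumed formulas: code / total, comments / code, language lines / total.
pub fn sample_summary() -> (String, String, String) {
    let total_lines = 40_876_027;
    let code_lines = 31_293_116;
    let comment_lines = 4_433_479;
    let c_lines = 25_268_107;
    (
        one_decimal(percent(code_lines, total_lines)),
        one_decimal(percent(comment_lines, code_lines)),
        one_decimal(percent(c_lines, total_lines)),
    )
}
```

Under these assumed formulas, `sample_summary()` reproduces the Code Ratio, Documentation, and C Share% values printed in the new table.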
@@ -178,7 +232,3 @@ cargo fmt
 # Lint code
 cargo clippy --all-targets --all-features
 ```
-
-## Acknowledgments
-
-- Inspired by [tokei](https://github.com/XAMPPRocky/tokei)

src/cli/url_parser.rs

Lines changed: 10 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -3,19 +3,22 @@ pub fn expand_url(url: &str) -> String {
33
return url.to_string();
44
}
55

6-
if url.contains('/') && !url.contains('.') {
6+
if url.contains('/') && !url.starts_with("http://") && !url.starts_with("https://") {
77
let parts: Vec<&str> = url.split('@').collect();
88
let repo_part = parts[0];
99
let branch_or_commit = parts.get(1);
1010

11-
if let Some(branch) = branch_or_commit {
12-
if branch.len() >= 7 && branch.chars().all(|c| c.is_ascii_hexdigit()) {
13-
return format!("https://github.com/{}/commit/{}", repo_part, branch);
11+
let path_parts: Vec<&str> = repo_part.split('/').collect();
12+
if path_parts.len() == 2 {
13+
if let Some(branch) = branch_or_commit {
14+
if branch.len() >= 7 && branch.chars().all(|c| c.is_ascii_hexdigit()) {
15+
return format!("https://github.com/{}/commit/{}", repo_part, branch);
16+
} else {
17+
return format!("https://github.com/{}/tree/{}", repo_part, branch);
18+
}
1419
} else {
15-
return format!("https://github.com/{}/tree/{}", repo_part, branch);
20+
return format!("https://github.com/{}", repo_part);
1621
}
17-
} else {
18-
return format!("https://github.com/{}", repo_part);
1922
}
2023
}
2124
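
To make the effect of this hunk easier to see, here is a minimal standalone sketch of the post-commit shorthand rules. `expand_shorthand` is a hypothetical name. The early return for absolute URLs and the final fallback that returns the input unchanged are assumptions, because those parts of `expand_url` fall outside the diff context.

```rust
/// Sketch of the shorthand expansion after this commit.
/// Only the middle block mirrors the diff; the first and last
/// statements are assumptions about code the hunk does not show.
pub fn expand_shorthand(input: &str) -> String {
    // Assumption: absolute URLs are passed through untouched.
    if input.starts_with("http://") || input.starts_with("https://") {
        return input.to_string();
    }

    if input.contains('/') {
        // "owner/repo@ref" splits into the repository part and an optional ref.
        let parts: Vec<&str> = input.split('@').collect();
        let repo_part = parts[0];
        let reference = parts.get(1);

        // New rule: only exactly "owner/repo" counts as GitHub shorthand.
        if repo_part.split('/').count() == 2 {
            return match reference {
                // Seven or more hex digits are treated as a commit hash.
                Some(r) if r.len() >= 7 && r.chars().all(|c| c.is_ascii_hexdigit()) => {
                    format!("https://github.com/{}/commit/{}", repo_part, r)
                }
                // Anything else after '@' is treated as a branch name.
                Some(r) => format!("https://github.com/{}/tree/{}", repo_part, r),
                None => format!("https://github.com/{}", repo_part),
            };
        }
    }

    // Assumption: unrecognized input is returned unchanged.
    input.to_string()
}
```

Two consequences follow from the changed condition. Because the `!url.contains('.')` check is gone, shorthand containing a dot, such as `kubernetes/kubernetes@release-1.28`, now expands. Because of the two-segment check, an input like `a/b/c` is no longer rewritten to a GitHub URL. The hex test is unchanged, so a branch whose name is seven or more hex digits still resolves as a commit.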

src/core/net.rs

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -93,6 +93,22 @@ impl RemoteAnalyzer {
9393
return Ok(url.to_string());
9494
}
9595

96+
if url.starts_with("http://") || url.starts_with("https://") {
97+
if !url.contains("github.com")
98+
&& !url.contains("gitlab.com")
99+
&& !url.contains("gitlab.")
100+
&& !url.contains("bitbucket.org")
101+
&& !url.contains("codeberg.org")
102+
{
103+
let tarball_url = if url.ends_with(".tar.gz") || url.ends_with(".tgz") {
104+
url.to_string()
105+
} else {
106+
format!("{}.tar.gz", url)
107+
};
108+
return Ok(tarball_url);
109+
}
110+
}
111+
96112
let branches = ["main", "master", "develop", "dev"];
97113

98114
if let Some(github_url) = self.parse_github_url_with_branch(url) {
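
The added block can be read as a standalone rule: an HTTP(S) URL that matches none of the known hosting platforms is treated as a direct tarball link. The sketch below isolates that rule. `direct_tarball_url` and `KNOWN_HOST_MARKERS` are hypothetical names, and returning `Option` in place of the method's early `return Ok(...)` is a simplification for illustration.

```rust
/// Substrings that mark a URL as belonging to a supported platform,
/// copied from the conditions in the diff.
const KNOWN_HOST_MARKERS: [&str; 5] = [
    "github.com",
    "gitlab.com",
    "gitlab.",
    "bitbucket.org",
    "codeberg.org",
];

/// Returns the tarball URL for an unrecognized HTTP(S) host, or `None`
/// when the platform-specific resolution should handle the input.
pub fn direct_tarball_url(url: &str) -> Option<String> {
    let is_http = url.starts_with("http://") || url.starts_with("https://");
    if !is_http {
        return None;
    }

    // Same substring test as the diff: it matches anywhere in the URL,
    // not just in the host component.
    if KNOWN_HOST_MARKERS.iter().any(|marker| url.contains(marker)) {
        return None;
    }

    if url.ends_with(".tar.gz") || url.ends_with(".tgz") {
        Some(url.to_string())
    } else {
        // Unknown hosts are assumed to serve an archive at "<url>.tar.gz".
        Some(format!("{}.tar.gz", url))
    }
}
```

Two details of the matching are worth noting when reviewing. The `gitlab.` marker already covers `gitlab.com`, so the separate `gitlab.com` check is redundant. Since `contains` scans the whole string, a URL on another host whose path happens to include `github.com` is also excluded from the tarball fallback.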
