Cursor Bugbot has reviewed your changes and found 2 potential issues.
Reviewed by Cursor Bugbot for commit 6a39daf. Configure here.
    keywords = []string{"confluence", "atlassian", "wiki"}

    // 44-char base64 PAT; decoded form must match the structural check below.
    tokenPat = regexp.MustCompile(detectors.PrefixRegex(keywords) + `\b([A-Za-z0-9+/]{44})\b`)
Trailing \b silently misses tokens ending in + or /
Medium Severity
The tokenPat regex uses \b (word boundary) around a character class [A-Za-z0-9+/]{44} that includes + and /. Since \b only considers [A-Za-z0-9_] as word characters, the trailing \b will fail to match whenever the 44th character happens to be + or / and is followed by whitespace, a newline, or end-of-string. This silently drops roughly 3% of structurally valid PATs (2 out of 64 base64 characters). Both test tokens conveniently end in alphanumeric characters (z and K), so this gap isn't caught by tests.
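The gap can be reproduced in isolation. The following is a minimal sketch, not the detector's code: it omits the keyword prefix, uses synthetic 44-character candidates, and `explicitDelimPat` is only one hypothetical alternative (explicit non-base64 delimiters in place of `\b`), not a fix proposed in this PR.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wordBoundaryPat mirrors the token part of tokenPat (keyword prefix omitted).
var wordBoundaryPat = regexp.MustCompile(`\b([A-Za-z0-9+/]{44})\b`)

// explicitDelimPat is a hypothetical alternative: instead of \b it requires a
// non-base64 character (or start/end of input) on each side of the candidate.
var explicitDelimPat = regexp.MustCompile(`(?:^|[^A-Za-z0-9+/])([A-Za-z0-9+/]{44})(?:[^A-Za-z0-9+/]|$)`)

// sampleCandidate builds a synthetic 44-character candidate ending in last.
// It is not a real token.
func sampleCandidate(last string) string {
	return "M" + strings.Repeat("A", 42) + last
}

// firstToken returns the first captured candidate, or "" if nothing matched.
func firstToken(re *regexp.Regexp, s string) string {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	for _, last := range []string{"A", "+", "/"} {
		tok := sampleCandidate(last)
		line := "token = " + tok + "\n"
		fmt.Printf("ends with %q: word boundary matched=%t, explicit delimiters matched=%t\n",
			last,
			firstToken(wordBoundaryPat, line) == tok,
			firstToken(explicitDelimPat, line) == tok)
	}
}
```

One trade-off of the explicit-delimiter form is that the delimiter character is consumed by the match, so two candidates separated by a single character would not both be found by a non-overlapping scan.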
    for baseURL := range uniqueURLs {
        if invalidHosts.Exists(baseURL) {
            continue
        }
Unreachable host handling inconsistently drops subsequent token results
Medium Severity
When errNoHost occurs during verification, the host is cached in invalidHosts but the result is still emitted with a verification error. However, the check at line 122 silently skips any cached invalid host via continue, emitting no result at all. This means the first token encountering an unreachable host gets a result, but subsequent tokens for the same host are silently dropped — not even as token-only results, since len(uniqueURLs) > 0 prevents that fallback. Because Go map iteration is randomized, which token gets reported is non-deterministic. The comparable artifactory detector avoids this by consistently skipping results on errNoHost with continue.
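One way to make the two paths consistent is sketched below, following the "skip on errNoHost" behavior attributed to the artifactory detector above. Everything here is a simplified stand-in: `finding`, `pair`, the plain map used as the host cache, and the `verify` callback are hypothetical names for illustration, and slices replace the map iteration so the output order is deterministic.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoHost stands in for the detector's DNS-failure error.
var errNoHost = errors.New("no such host")

// finding is a simplified stand-in for detectors.Result.
type finding struct {
	Token, URL string
	Verified   bool
}

// pair emits a finding for every token/URL combination. An unreachable host is
// skipped the same way whether the failure was just observed or already cached.
func pair(tokens, urls []string, verify func(url, token string) (bool, error)) []finding {
	invalidHosts := map[string]struct{}{}
	var out []finding
	for _, token := range tokens {
		for _, u := range urls {
			if _, bad := invalidHosts[u]; bad {
				continue
			}
			ok, err := verify(u, token)
			if errors.Is(err, errNoHost) {
				invalidHosts[u] = struct{}{}
				continue // same outcome as the cached path above
			}
			out = append(out, finding{Token: token, URL: u, Verified: ok})
		}
	}
	return out
}

func main() {
	verify := func(url, token string) (bool, error) {
		if url == "http://down.invalid" {
			return false, errNoHost
		}
		return token == "good-token", nil
	}
	urls := []string{"http://down.invalid", "http://confluence.example:8090"}
	for _, f := range pair([]string{"good-token", "bad-token"}, urls, verify) {
		fmt.Printf("%+v\n", f)
	}
}
```

With this shape a token whose only candidate hosts are all unreachable produces no finding at all. If that is undesirable, falling back to a token-only unverified result for such tokens would be another consistent option.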
    keywords = []string{"confluence", "atlassian", "wiki"}

    // 44-char base64 PAT; decoded form must match the structural check below.
    tokenPat = regexp.MustCompile(detectors.PrefixRegex(keywords) + `\b([A-Za-z0-9+/]{44})\b`)
Is there any way we can tighten this regex? I do see that you have an additional check that validates the structure of the decoded token, which is good.
Jira has a similar pattern (the same <numeric id>:<random bytes> structure). I found that this structure means the credential always starts with M, N, or O. Is that also the case for Confluence?
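The M/N/O observation follows from base64 itself: the first output character depends only on the top six bits of the first byte, and the ASCII digits 0x30 to 0x39 span just three values there. A small sketch to check this (`firstChars` is a hypothetical helper, and the payload after the first digit is arbitrary):

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// firstChars returns the distinct first base64 characters produced when the
// decoded payload starts with an ASCII digit, as in "<numeric id>:<random bytes>".
func firstChars() string {
	seen := map[byte]bool{}
	var out []byte
	for d := byte('0'); d <= '9'; d++ {
		// Only the top 6 bits of the first byte determine the first character,
		// so the rest of the payload does not matter here.
		enc := base64.StdEncoding.EncodeToString([]byte{d, ':', 0})
		if !seen[enc[0]] {
			seen[enc[0]] = true
			out = append(out, enc[0])
		}
	}
	return string(out)
}

func main() {
	fmt.Println(firstChars()) // prints "MNO"
}
```

So if Confluence PATs always decode to a digit-first payload, which isStructuralPAT already requires, a pattern such as `[MNO][A-Za-z0-9+/]{43}` would reject nothing the structural check accepts. Whether every real Confluence DC token has that shape is an assumption carried over from the structural check, not something verified here.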
    // isStructuralPAT decodes a candidate base64 string and checks that it matches
    // the "<numeric id>:<random bytes>" structure used by Confluence DC PATs:
    // one or more ASCII digits, a colon, then at least one more byte.
    func isStructuralPAT(candidate string) bool {
        raw, err := base64.StdEncoding.DecodeString(candidate)
        if err != nil {
            return false
        }
        colon := bytes.IndexByte(raw, ':')
        if colon <= 0 || colon == len(raw)-1 {
            return false
        }
        for _, b := range raw[:colon] {
            if b < '0' || b > '9' {
                return false
            }
        }
        return true
    }
I really like this. It's a great check for noise reduction. Will implement this in my Jira PR as well.
    if len(uniqueURLs) == 0 {
        // Token-only: report unverified since we can't reach a host.
        results = append(results, detectors.Result{
            DetectorType: detector_typepb.DetectorType_ConfluenceDataCenter,
            Raw: []byte(token),
            RawV2: []byte(token),
        })
        continue
I like the idea of reporting tokens as unverified if no URL was configured/found. Even though it can be misleading when the token is actually live, it is still better than not reporting it at all.
A suggestion: we could indicate (maybe as a message in ExtraData?) that we were unable to verify the token because no URL/host was found. That would be a useful distinction for the user.
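A rough sketch of what that could look like. `Result` below is a pared-down local stand-in for detectors.Result so the example compiles on its own; it assumes ExtraData is a string-to-string map, and the `verification_note` key and message text are made up for illustration.

```go
package main

import "fmt"

// Result is a pared-down stand-in for detectors.Result; the real struct has
// more fields. ExtraData is assumed to be a map[string]string.
type Result struct {
	Raw       []byte
	Verified  bool
	ExtraData map[string]string
}

// tokenOnlyResult builds an unverified result and records why verification was
// skipped. The "verification_note" key is hypothetical.
func tokenOnlyResult(token string) Result {
	return Result{
		Raw: []byte(token),
		ExtraData: map[string]string{
			"verification_note": "not verified: no Confluence host/URL found near the token",
		},
	}
}

func main() {
	r := tokenOnlyResult("example-token")
	fmt.Println(r.Verified, r.ExtraData["verification_note"])
}
```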
    switch resp.StatusCode {
    case http.StatusOK:
        return true, nil
    case http.StatusUnauthorized, http.StatusForbidden:
Are we sure that the API can return both 401 and 403 for an expired/invalid token? Usually it's one of these, and 403 can indicate lack of permissions instead of the credential being invalid.
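If it turns out that 403 can come back for a valid token that merely lacks permission, the two codes could be split along these lines. This is a sketch under that assumption only: `classifyStatus` is a hypothetical helper, and how Confluence Data Center actually responds to expired or invalid tokens would still need to be confirmed against a real instance.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// classifyStatus maps a response status to (verified, error).
// Assumption: 401 means the token was rejected, while 403 is ambiguous (the
// token may be valid but lack permission), so 403 is surfaced as an error
// instead of being reported as a definite "invalid".
func classifyStatus(code int) (bool, error) {
	switch code {
	case http.StatusOK:
		return true, nil
	case http.StatusUnauthorized:
		return false, nil
	case http.StatusForbidden:
		return false, errors.New("indeterminate: 403 may mean a valid token without permission")
	default:
		return false, fmt.Errorf("unexpected status code %d", code)
	}
}

func main() {
	for _, code := range []int{200, 401, 403, 500} {
		ok, err := classifyStatus(code)
		fmt.Println(code, ok, err)
	}
}
```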
mustansir14 left a comment
Overall looks good to me. I have some questions/suggestions which you can look into.
Also it seems the credential pattern for this is the same as Jira Data Center, so there may be some overlap in results. I guess that's okay?


Summary
Adds a detector for Confluence Data Center Personal Access Tokens.
- <numeric_id>:<random_bytes> structural shape at the byte level
- http?://host(:port)?, not just https://. On-prem Confluence commonly runs plain HTTP inside corporate networks and on non-standard ports (:8090, :8443).

Testing
- gock.

Checklist:
- (make test-community)?
- (make lint; this requires golangci-lint)?

Note
Medium Risk
Adds a new detector with optional live HTTP verification against self-hosted Confluence instances; risk is mainly false positives/extra network calls and endpoint pairing behavior during scans.
Overview
Adds a new ConfluenceDataCenter detector that identifies Confluence Data Center Personal Access Tokens by matching 44-char base64 candidates and post-filtering them via base64 decode to the expected <numeric_id>:<bytes> structure, then optionally pairing them with nearby instance base URLs.

When verification is enabled, it validates tokens by calling GET /rest/api/user/current with Bearer auth (caching DNS failures to avoid repeated lookups). The detector is registered in default detector lists and a new DetectorType_ConfluenceDataCenter enum value is added, with unit tests covering pattern extraction, token-only emission, URL pairing, and verification status handling.
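
For reference, the verification request described in the overview can be sketched as follows. This is not the detector's implementation: `checkToken`, `statusFor`, and `fakeConfluence` are hypothetical names, and the local httptest server only imitates the endpoint by accepting a single hard-coded token.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// checkToken sends GET <baseURL>/rest/api/user/current with Bearer auth and
// returns the HTTP status code of the response.
func checkToken(ctx context.Context, client *http.Client, baseURL, token string) (int, error) {
	endpoint := strings.TrimRight(baseURL, "/") + "/rest/api/user/current"
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return 0, err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := client.Do(req)
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

// statusFor is a convenience wrapper using a background context and the
// default HTTP client.
func statusFor(baseURL, token string) (int, error) {
	return checkToken(context.Background(), http.DefaultClient, baseURL, token)
}

// fakeConfluence starts a local server that accepts exactly one token on the
// endpoint above. It stands in for a real instance in this sketch.
func fakeConfluence(validToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/rest/api/user/current" {
			http.NotFound(w, r)
			return
		}
		if r.Header.Get("Authorization") != "Bearer "+validToken {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := fakeConfluence("example-token")
	defer srv.Close()
	for _, tok := range []string{"example-token", "wrong-token"} {
		code, err := statusFor(srv.URL, tok)
		fmt.Println(tok, code, err)
	}
}
```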