update DOI invalid list #4586

Merged on May 8, 2024 (1 commit)
28 changes: 20 additions & 8 deletions expandFns.php
@@ -196,18 +196,20 @@
}
$registrant = $matches[1];
// TODO this will need updated over time. See registrant_err_patterns on https://en.wikipedia.org/wiki/Module:Citation/CS1/Identifiers
-// 14:43, January 14, 2023 version is last check
+// 16:42, November 25, 2023 version is last check
if (strpos($registrant, '10.') === 0) { // We have to deal with valid handles in the DOI field - very rare, so only check actual DOIs
$registrant = substr($registrant,3);
if (preg_match('~^[^1-3]\d\d\d\d\.\d\d*$~', $registrant) || // 5 digits with subcode (0xxxx, 40000+); accepts: 10000–39999
-preg_match('~^[^1-6]\d\d\d\d$~', $registrant) || // 5 digits without subcode (0xxxx, 60000+); accepts: 10000–59999
+preg_match('~^[^1-7]\d\d\d\d$~', $registrant) || // 5 digits without subcode (0xxxx, 60000+); accepts: 10000–69999
preg_match('~^[^1-9]\d\d\d\.\d\d*$~', $registrant) || // 4 digits with subcode (0xxx); accepts: 1000–9999
preg_match('~^[^1-9]\d\d\d$~', $registrant) || // 4 digits without subcode (0xxx); accepts: 1000–9999
preg_match('~^\d\d\d\d\d\d+~', $registrant) || // 6 or more digits
preg_match('~^\d\d?\d?$~', $registrant) || // less than 4 digits without subcode (3 digits with subcode is legitimate)
preg_match('~^\d\d?\.[\d\.]+~', $registrant) || // 1 or 2 digits with subcode
$registrant === '5555' || // test registrant will never resolve
-preg_match('~[^\d\.]~', $registrant)) return false; // any character that isn't a digit or a dot
+preg_match('~[^\d\.]~', $registrant)) { // any character that isn't a digit or a dot
+return false;
+}
}
throttle_dx();
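
To make the effect of the updated pattern list easier to check by hand, here is a minimal sketch that applies the post-merge patterns from the hunk above to a DOI prefix. The function name `doi_prefix_is_rejected` is hypothetical and not part of expandFns.php; the sketch covers only the pattern checks, not `throttle_dx()` or anything that follows. Note that a negated class such as `[^1-7]` rejects a leading 0, 8, or 9 (and any non-digit), so with the new pattern a five-digit registrant without a subcode passes when it starts with 1 through 7.

```php
<?php
declare(strict_types=1);

// Hypothetical helper (not in expandFns.php): returns true when the prefix
// starts with "10." and the registrant after it matches one of the
// error patterns listed in the hunk above.
function doi_prefix_is_rejected(string $prefix): bool {
    if (strpos($prefix, '10.') !== 0) {
        return false; // handles in the DOI field are not checked
    }
    $registrant = substr($prefix, 3);
    if ($registrant === '5555') {
        return true; // test registrant will never resolve
    }
    $patterns = [
        '~^[^1-3]\d\d\d\d\.\d\d*$~', // 5 digits with subcode, first character not 1-3
        '~^[^1-7]\d\d\d\d$~',        // 5 digits without subcode, first character not 1-7
        '~^[^1-9]\d\d\d\.\d\d*$~',   // 4 digits with subcode, leading 0
        '~^[^1-9]\d\d\d$~',          // 4 digits without subcode, leading 0
        '~^\d\d\d\d\d\d+~',          // 6 or more digits
        '~^\d\d?\d?$~',              // fewer than 4 digits without subcode
        '~^\d\d?\.[\d\.]+~',         // 1 or 2 digits with subcode
        '~[^\d\.]~',                 // any character that is not a digit or a dot
    ];
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $registrant) === 1) {
            return true;
        }
    }
    return false;
}
```

For example, `doi_prefix_is_rejected('10.70000')` returns false under these patterns, while `doi_prefix_is_rejected('10.80000')` returns true.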

@@ -1168,15 +1170,21 @@
}
if (preg_match('~^(\d\d?)/(\d\d?)/(\d{4})$~', $string, $matches)) { // dates with slashes
if (intval($matches[1]) < 13 && intval($matches[2]) > 12) {
-if (strlen($matches[1]) === 1) $matches[1] = '0' . $matches[1];
+if (strlen($matches[1]) === 1) {
+$matches[1] = '0' . $matches[1];
+}
(Codecov / codecov/patch warning: added line #L1174 in expandFns.php was not covered by tests.)
return $matches[3] . '-' . $matches[1] . '-' . $matches[2];
} elseif (intval($matches[2]) < 13 && intval($matches[1]) > 12) {
-if (strlen($matches[2]) === 1) $matches[2] = '0' . $matches[2];
+if (strlen($matches[2]) === 1) {
+$matches[2] = '0' . $matches[2];
+}
(Codecov / codecov/patch warning: added line #L1179 in expandFns.php was not covered by tests.)
return $matches[3] . '-' . $matches[2] . '-' . $matches[1];
} elseif (intval($matches[2]) > 12 && intval($matches[1]) > 12) {
return '';
} elseif ($matches[1] === $matches[2]) {
-if (strlen($matches[2]) === 1) $matches[2] = '0' . $matches[2];
+if (strlen($matches[2]) === 1) {
+$matches[2] = '0' . $matches[2];
+}
(Codecov / codecov/patch warning: added line #L1186 in expandFns.php was not covered by tests.)
return $matches[3] . '-' . $matches[2] . '-' . $matches[2];
} else {
return $matches[3];// do not know. just give year
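
Since Codecov reports the three zero-padding branches in this hunk as untested, a small standalone sketch of the slash-date logic may help when writing cases for them. The function name `slash_date_to_iso` and the `null` return for non-matching input are assumptions for illustration; the branch logic mirrors what the hunk shows and nothing beyond it.

```php
<?php
declare(strict_types=1);

// Hypothetical standalone version of the slash-date branch shown above.
// Returns null when the input is not of the form d/d/yyyy (an assumption;
// the surrounding function in expandFns.php is not shown in the hunk).
function slash_date_to_iso(string $string): ?string {
    if (!preg_match('~^(\d\d?)/(\d\d?)/(\d{4})$~', $string, $matches)) {
        return null;
    }
    if (intval($matches[1]) < 13 && intval($matches[2]) > 12) {
        // First field must be the month: month/day/year
        if (strlen($matches[1]) === 1) {
            $matches[1] = '0' . $matches[1];
        }
        return $matches[3] . '-' . $matches[1] . '-' . $matches[2];
    } elseif (intval($matches[2]) < 13 && intval($matches[1]) > 12) {
        // Second field must be the month: day/month/year
        if (strlen($matches[2]) === 1) {
            $matches[2] = '0' . $matches[2];
        }
        return $matches[3] . '-' . $matches[2] . '-' . $matches[1];
    } elseif (intval($matches[2]) > 12 && intval($matches[1]) > 12) {
        return ''; // neither field can be a month
    } elseif ($matches[1] === $matches[2]) {
        // Same text in both fields, so the order does not matter
        if (strlen($matches[2]) === 1) {
            $matches[2] = '0' . $matches[2];
        }
        return $matches[3] . '-' . $matches[2] . '-' . $matches[2];
    }
    return $matches[3]; // ambiguous: give the year only
}
```

Inputs such as `3/25/2020`, `25/3/2020`, and `5/5/2020` each reach one of the padding branches flagged by the coverage check.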
@@ -1250,7 +1258,9 @@
}

function not_bad_10_1093_doi(string $url): bool { // We assume DOIs are bad, unless on good list
-if ($url === '') return true;
+if ($url === '') {
+return true;
+}
if(!preg_match('~10.1093/([^/]+)/~u', $url, $match)) {
return true;
}
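
The remainder of `not_bad_10_1093_doi` is not shown in this hunk. Going only by its comment ("We assume DOIs are bad, unless on good list"), the two early returns plus a list lookup could be sketched as below. The function name `not_bad_10_1093_doi_sketch`, the `$good_codes` parameter, and the lookup itself are assumptions; the actual list and comparison in expandFns.php may differ.

```php
<?php
declare(strict_types=1);

// Hypothetical sketch: only the two early returns come from the hunk above.
// The good-list lookup is an assumption based on the function's comment.
function not_bad_10_1093_doi_sketch(string $url, array $good_codes): bool {
    if ($url === '') {
        return true; // nothing to check
    }
    // Same pattern as in the hunk; the unescaped dots match any character.
    if (!preg_match('~10.1093/([^/]+)/~u', $url, $match)) {
        return true; // not a 10.1093/<code>/... DOI
    }
    // Assumed rule: the DOI is acceptable only if its code is on the list.
    return in_array($match[1], $good_codes, true);
}
```

With a placeholder list such as `['abc']`, a URL containing `10.1093/abc/xyz123` would pass and one containing `10.1093/def/xyz123` would not.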
@@ -3065,7 +3075,9 @@
$url .= $part . "&" ;
break;
case "as_epq":
-if ($it_is_blank) break;
+if ($it_is_blank) {
+break;
+}
(Codecov / codecov/patch warning: added line #L3079 in expandFns.php was not covered by tests.)
$url .= $part . "&" ;
break;
case "btnG":