implement url cleaning based on ClearURLs rules and logic
@@ -1,6 +1,6 @@
 {
 	"printWidth": 140,
-	"singleQuote": true,
+	"singleQuote": false,
 	"semi": true,
 	"useTabs": true
 }
115
CLAUDE.md
Normal file
@@ -0,0 +1,115 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a Cloudflare Workers project called "url-cleaner" built with TypeScript. It's a serverless URL cleaning service that removes tracking parameters from URLs, similar to the ClearURLs browser extension. The worker runs on Cloudflare's edge network for fast global performance.

## Architecture

- **Runtime**: Cloudflare Workers (serverless edge computing)
- **Language**: TypeScript with ES2021 target
- **Entry Point**: `src/index.ts` - main worker handler accepting `?url=` parameter
- **URL Cleaner**: `src/cleaner.ts` - core cleaning engine with redirect following and fragment cleaning
- **Rules Cache**: `src/rules-cache.ts` - Durable Object for caching ClearURLs rules with SHA256 validation
- **Types**: `src/types.ts` - TypeScript interfaces for ClearURLs rule structure
- **Configuration**: `wrangler.jsonc` - Cloudflare Workers deployment configuration with Durable Objects
- **Testing**: Vitest with Cloudflare Workers testing pool

## API Usage

**Endpoint**: `GET /?url=<encoded-url>`
**Response**: Plain text containing the cleaned URL

**Examples**:

- `/?url=https://example.com?utm_source=test` → `https://example.com`
- `/?url=https://youtube.com/watch?v=abc&feature=share` → `https://youtube.com/watch?v=abc`
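
Because the target URL travels inside the `url` query parameter, any `?`, `&` or `=` it contains needs to be percent-encoded, as the `<encoded-url>` placeholder suggests; otherwise `feature=share` would be read as a parameter of the worker request itself. A minimal client-side sketch, assuming the local development origin `http://localhost:8787`. `buildCleanerRequest` is a hypothetical helper for illustration and is not part of this repository:

```typescript
// Hypothetical helper (not part of this repository): builds the request URL
// for the worker, percent-encoding the target so its own `?` and `&` survive.
function buildCleanerRequest(workerOrigin: string, targetUrl: string): string {
	const request = new URL("/", workerOrigin);
	// URLSearchParams.set() encodes reserved characters such as `&`, `?` and `=`
	request.searchParams.set("url", targetUrl);
	return request.href;
}

const target = "https://youtube.com/watch?v=abc&feature=share";
const requestUrl = buildCleanerRequest("http://localhost:8787", target);

// Decoding the `url` parameter gives back the original target unchanged
const decodedTarget = new URL(requestUrl).searchParams.get("url");
console.log(requestUrl);
console.log(decodedTarget === target);
```

The resulting `requestUrl` can be passed to `fetch`; per the API description above, the cleaned URL would then be the plain-text response body.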

## Common Commands

### Development

- `npm run dev` or `npm run start` - Start local development server with Wrangler
- `npm run deploy` - Deploy worker to Cloudflare
- `npm test` - Run tests with Vitest
- `npm run format` - Format code with Prettier
- `npm run format:check` - Check code formatting
- `npm run cf-typegen` - Generate TypeScript types for Cloudflare bindings

### Development Server

The development server runs on `http://localhost:8787/` by default.

## Project Structure

```
src/
├── index.ts          # Main worker entry point with fetch handler
├── cleaner.ts        # Core URL cleaning engine with redirect following and fragment cleaning
├── rules-cache.ts    # Durable Object for caching ClearURLs rules with validation
└── types.ts          # TypeScript interfaces for ClearURLs rule structure
test/
├── index.spec.ts     # Unit and integration tests with ClearURLs rule mocking
├── tsconfig.json     # Test-specific TypeScript config
└── env.d.ts          # Test environment types
```

## URL Cleaning Features

### Core Functionality

- **Official ClearURLs Rules**: Uses the same rule database as ClearURLs browser extension (250+ providers)
- **Redirect Following**: Follows up to 5 redirects to unwrap shortened URLs and tracking redirects
- **Provider-Specific Rules**: Domain-specific rules prevent false positives and preserve functionality
- **Query Parameter Cleaning**: Removes tracking from URL query parameters (`?utm_source=test`)
- **Fragment Cleaning**: Removes tracking from URL fragments/hash parameters (`#utm_campaign=test`)
- **Raw Rules Support**: Handles complex regex-based cleaning for advanced tracking patterns
- **Loop Prevention**: Tracks visited URLs to prevent infinite redirect loops
- **Response Caching**: 1-hour cache for improved performance
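
Fragment cleaning, as listed above, can be pictured as treating the part after `#` like a query string. The sketch below is only an illustration under that assumption and is not the implementation in `src/cleaner.ts`; `cleanFragment` and the `isUtm` predicate are hypothetical names:

```typescript
// Sketch only: treats the fragment as `key=value&key=value` pairs and drops
// the keys flagged by `isTracking` (a hypothetical predicate supplied by the caller).
function cleanFragment(href: string, isTracking: (name: string) => boolean): string {
	const url = new URL(href);
	const pairs = new URLSearchParams(url.hash.slice(1)); // strip the leading "#"

	// Collect the tracked keys first so nothing is deleted while iterating
	const tracked: string[] = [];
	pairs.forEach((_value, name) => {
		if (isTracking(name) && tracked.indexOf(name) === -1) {
			tracked.push(name);
		}
	});

	if (tracked.length === 0) {
		return url.href; // leave plain fragments such as "#section" untouched
	}

	for (const name of tracked) {
		pairs.delete(name);
	}
	url.hash = pairs.toString(); // an empty string removes the fragment entirely
	return url.href;
}

// Illustrative predicate: flags any key starting with "utm_", ignoring case
const isUtm = (name: string): boolean => /^utm_/i.test(name);
console.log(cleanFragment("https://example.com/page#utm_campaign=test&tab=2", isUtm));
```

Returning early when nothing matches avoids re-serializing fragments that are not key/value pairs at all.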

### Rule System

- **Live Updates**: Fetches rules from ClearURLs official API (`https://rules2.clearurls.xyz/`)
- **SHA256 Validation**: Cryptographically verifies rule integrity
- **Durable Object Caching**: 7-day cache with automatic refresh
- **Fallback Support**: Uses cached rules if fresh fetch fails
- **Provider Priority**: Domain-specific providers take precedence over global rules

### Supported Providers (250+ total)

- **Google**: Search, Analytics, Ads tracking removal
- **YouTube**: Preserves video/playlist IDs, removes tracking
- **Amazon**: E-commerce functionality preserved, affiliate tracking removed
- **Facebook/Meta**: Social features preserved, tracking removed
- **Twitter/X**: Post/user functionality preserved, tracking removed
- **TikTok**: Video functionality preserved, tracking removed
- **And 240+ more providers** maintained by the ClearURLs community

### Advanced Features

- **Exception Handling**: Respects provider-specific exception rules
- **Redirection Unwrapping**: Follows ClearURLs redirection patterns
- **Regex Pattern Matching**: Exact ClearURLs compatibility with `^rule$` pattern matching
- **Case-Insensitive Matching**: Handles mixed-case tracking parameters
- **Fragment Parameter Support**: Cleans tracking from URL hash fragments
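
The `^rule$` and case-insensitive matching described above might look roughly like the following. This is a hedged sketch rather than the repository's code: `matchesRule` and `stripTrackingParams` are hypothetical names, and the rule strings used with them are illustrative values, not entries copied from the ClearURLs database:

```typescript
// Sketch only: `rules` holds regex sources in the style of ClearURLs rule entries.
function matchesRule(paramName: string, rules: string[]): boolean {
	return rules.some((rule) => {
		try {
			// Anchoring with ^...$ makes the rule match the whole parameter name;
			// the "i" flag makes the comparison case-insensitive
			return new RegExp("^" + rule + "$", "i").test(paramName);
		} catch {
			return false; // skip rules that are not valid regular expressions
		}
	});
}

function stripTrackingParams(href: string, rules: string[]): string {
	const url = new URL(href);

	// Collect the names first so nothing is deleted while iterating
	const names: string[] = [];
	url.searchParams.forEach((_value, name) => {
		names.push(name);
	});

	for (const name of names) {
		if (matchesRule(name, rules)) {
			url.searchParams.delete(name);
		}
	}
	return url.href;
}

console.log(stripTrackingParams("https://example.com/?UTM_Source=test&id=1", ["utm_[a-z]+"]));
```

Anchoring matters here: without `^` and `$`, a rule such as `utm_[a-z]+` would also remove an unrelated parameter like `my_utm_source`.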

## Key Files

- `wrangler.jsonc` - Worker configuration including compatibility date and observability
- `vitest.config.mts` - Test configuration using Cloudflare Workers pool
- `worker-configuration.d.ts` - Generated TypeScript definitions for Cloudflare bindings
- `tsconfig.json` - TypeScript configuration with strict mode enabled

## Testing

The project uses Vitest with the Cloudflare Workers testing pool (`@cloudflare/vitest-pool-workers`). Tests support both unit testing (with mocked context) and integration testing (using `SELF.fetch()`).

## TypeScript Configuration

- Target: ES2021
- Module: ES2022
- Strict mode enabled
- JSX support configured for React
- Excludes test files from main compilation
165
LICENSE
Normal file
@@ -0,0 +1,165 @@
                   GNU LESSER GENERAL PUBLIC LICENSE
                       Version 3, 29 June 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.


  This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.

  0. Additional Definitions.

  As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.

  "The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.

  An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.

  A "Combined Work" is a work produced by combining or linking an
Application with the Library.  The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".

  The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.

  The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.

  1. Exception to Section 3 of the GNU GPL.

  You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.

  2. Conveying Modified Versions.

  If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:

   a) under this License, provided that you make a good faith effort to
   ensure that, in the event an Application does not supply the
   function or data, the facility still operates, and performs
   whatever part of its purpose remains meaningful, or

   b) under the GNU GPL, with none of the additional permissions of
   this License applicable to that copy.

  3. Object Code Incorporating Material from Library Header Files.

  The object code form of an Application may incorporate material from
a header file that is part of the Library.  You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:

   a) Give prominent notice with each copy of the object code that the
   Library is used in it and that the Library and its use are
   covered by this License.

   b) Accompany the object code with a copy of the GNU GPL and this license
   document.

  4. Combined Works.

  You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:

   a) Give prominent notice with each copy of the Combined Work that
   the Library is used in it and that the Library and its use are
   covered by this License.

   b) Accompany the Combined Work with a copy of the GNU GPL and this license
   document.

   c) For a Combined Work that displays copyright notices during
   execution, include the copyright notice for the Library among
   these notices, as well as a reference directing the user to the
   copies of the GNU GPL and this license document.

   d) Do one of the following:

       0) Convey the Minimal Corresponding Source under the terms of this
       License, and the Corresponding Application Code in a form
       suitable for, and under terms that permit, the user to
       recombine or relink the Application with a modified version of
       the Linked Version to produce a modified Combined Work, in the
       manner specified by section 6 of the GNU GPL for conveying
       Corresponding Source.

       1) Use a suitable shared library mechanism for linking with the
       Library.  A suitable mechanism is one that (a) uses at run time
       a copy of the Library already present on the user's computer
       system, and (b) will operate properly with a modified version
       of the Library that is interface-compatible with the Linked
       Version.

   e) Provide Installation Information, but only if you would otherwise
   be required to provide such information under section 6 of the
   GNU GPL, and only to the extent that such information is
   necessary to install and execute a modified version of the
   Combined Work produced by recombining or relinking the
   Application with a modified version of the Linked Version. (If
   you use option 4d0, the Installation Information must accompany
   the Minimal Corresponding Source and Corresponding Application
   Code. If you use option 4d1, you must provide the Installation
   Information in the manner specified by section 6 of the GNU GPL
   for conveying Corresponding Source.)

  5. Combined Libraries.

  You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:

   a) Accompany the combined library with a copy of the same work based
   on the Library, uncombined with any other library facilities,
   conveyed under the terms of this License.

   b) Give prominent notice with the combined library that part of it
   is a work based on the Library, and explaining where to find the
   accompanying uncombined form of the same work.

  6. Revised Versions of the GNU Lesser General Public License.

  The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.

  Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.

  If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.
282
package-lock.json
generated
@@ -9,6 +9,7 @@
 "version": "0.0.0",
 "devDependencies": {
 "@cloudflare/vitest-pool-workers": "^0.8.19",
+"prettier": "^3.6.2",
 "typescript": "^5.5.2",
 "vitest": "~3.2.0",
 "wrangler": "^4.38.0"
@@ -80,6 +81,91 @@
 }
 }
 },
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/@cloudflare/workerd-darwin-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-64/-/workerd-darwin-64-1.20250906.0.tgz",
+"integrity": "sha512-E+X/YYH9BmX0ew2j/mAWFif2z05NMNuhCTlNYEGLkqMe99K15UewBqajL9pMcMUKxylnlrEoK3VNxl33DkbnPA==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"darwin"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/@cloudflare/workerd-darwin-arm64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-arm64/-/workerd-darwin-arm64-1.20250906.0.tgz",
+"integrity": "sha512-X5apsZ1SFW4FYTM19ISHf8005FJMPfrcf4U5rO0tdj+TeJgQgXuZ57IG0WeW7SpLVeBo8hM6WC8CovZh41AfnA==",
+"cpu": [
+"arm64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"darwin"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/@cloudflare/workerd-linux-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-64/-/workerd-linux-64-1.20250906.0.tgz",
+"integrity": "sha512-rlKzWgsLnlQ5Nt9W69YBJKcmTmZbOGu0edUsenXPmc6wzULUxoQpi7ZE9k3TfTonJx4WoQsQlzCUamRYFsX+0Q==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"linux"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/@cloudflare/workerd-linux-arm64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-arm64/-/workerd-linux-arm64-1.20250906.0.tgz",
+"integrity": "sha512-DdedhiQ+SeLzpg7BpcLrIPEZ33QKioJQ1wvL4X7nuLzEB9rWzS37NNNahQzc1+44rhG4fyiHbXBPOeox4B9XVA==",
+"cpu": [
+"arm64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"linux"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/@cloudflare/workerd-windows-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-windows-64/-/workerd-windows-64-1.20250906.0.tgz",
+"integrity": "sha512-Q8Qjfs8jGVILnZL6vUpQ90q/8MTCYaGR3d1LGxZMBqte8Vr7xF3KFHPEy7tFs0j0mMjnqCYzlofmPNY+9ZaDRg==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"win32"
+],
+"engines": {
+"node": ">=16"
+}
+},
 "node_modules/@cloudflare/vitest-pool-workers/node_modules/@esbuild/aix-ppc64": {
 "version": "0.25.4",
 "resolved": "https://registry.npmjs.org/@esbuild/aix-ppc64/-/aix-ppc64-0.25.4.tgz",
@@ -546,6 +632,27 @@
 "@esbuild/win32-x64": "0.25.4"
 }
 },
+"node_modules/@cloudflare/vitest-pool-workers/node_modules/workerd": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/workerd/-/workerd-1.20250906.0.tgz",
+"integrity": "sha512-ryVyEaqXPPsr/AxccRmYZZmDAkfQVjhfRqrNTlEeN8aftBk6Ca1u7/VqmfOayjCXrA+O547TauebU+J3IpvFXw==",
+"dev": true,
+"hasInstallScript": true,
+"license": "Apache-2.0",
+"bin": {
+"workerd": "bin/workerd"
+},
+"engines": {
+"node": ">=16"
+},
+"optionalDependencies": {
+"@cloudflare/workerd-darwin-64": "1.20250906.0",
+"@cloudflare/workerd-darwin-arm64": "1.20250906.0",
+"@cloudflare/workerd-linux-64": "1.20250906.0",
+"@cloudflare/workerd-linux-arm64": "1.20250906.0",
+"@cloudflare/workerd-windows-64": "1.20250906.0"
+}
+},
 "node_modules/@cloudflare/vitest-pool-workers/node_modules/wrangler": {
 "version": "4.35.0",
 "resolved": "https://registry.npmjs.org/wrangler/-/wrangler-4.35.0.tgz",
@@ -582,9 +689,9 @@
 }
 },
 "node_modules/@cloudflare/workerd-darwin-64": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-64/-/workerd-darwin-64-1.20250906.0.tgz",
-"integrity": "sha512-E+X/YYH9BmX0ew2j/mAWFif2z05NMNuhCTlNYEGLkqMe99K15UewBqajL9pMcMUKxylnlrEoK3VNxl33DkbnPA==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-64/-/workerd-darwin-64-1.20250920.0.tgz",
+"integrity": "sha512-IoZtLRBJ5vkOPZuSGgJN51YGBPn8h2R/rvr3OQWvomvc6zfZXJG2h8bkdaDEMQdiuys9wyXYQdYQ4NSFM4a2+A==",
 "cpu": [
 "x64"
 ],
@@ -594,14 +701,15 @@
 "os": [
 "darwin"
 ],
+"peer": true,
 "engines": {
 "node": ">=16"
 }
 },
 "node_modules/@cloudflare/workerd-darwin-arm64": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-arm64/-/workerd-darwin-arm64-1.20250906.0.tgz",
-"integrity": "sha512-X5apsZ1SFW4FYTM19ISHf8005FJMPfrcf4U5rO0tdj+TeJgQgXuZ57IG0WeW7SpLVeBo8hM6WC8CovZh41AfnA==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-arm64/-/workerd-darwin-arm64-1.20250920.0.tgz",
+"integrity": "sha512-4JLwIaJ5qAjeDj0WmtOC06y/2C2QKUtU5moD/tQGGwXJ4RwIstbPl9AfskgHdm3nUg0O/Pe0EaohARM7mQFzlA==",
 "cpu": [
 "arm64"
 ],
@@ -611,14 +719,15 @@
 "os": [
 "darwin"
 ],
+"peer": true,
 "engines": {
 "node": ">=16"
 }
 },
 "node_modules/@cloudflare/workerd-linux-64": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-64/-/workerd-linux-64-1.20250906.0.tgz",
-"integrity": "sha512-rlKzWgsLnlQ5Nt9W69YBJKcmTmZbOGu0edUsenXPmc6wzULUxoQpi7ZE9k3TfTonJx4WoQsQlzCUamRYFsX+0Q==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-64/-/workerd-linux-64-1.20250920.0.tgz",
+"integrity": "sha512-dWdSaqKPcfdSxa386fkZrNAGwCe+C/BQgLoPDpvjq//NK+9mzwK1Cv6URo1GobmGReBK67lsFQr50/ekFGb53A==",
 "cpu": [
 "x64"
 ],
@@ -628,14 +737,15 @@
 "os": [
 "linux"
 ],
+"peer": true,
 "engines": {
 "node": ">=16"
 }
 },
 "node_modules/@cloudflare/workerd-linux-arm64": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-arm64/-/workerd-linux-arm64-1.20250906.0.tgz",
-"integrity": "sha512-DdedhiQ+SeLzpg7BpcLrIPEZ33QKioJQ1wvL4X7nuLzEB9rWzS37NNNahQzc1+44rhG4fyiHbXBPOeox4B9XVA==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-arm64/-/workerd-linux-arm64-1.20250920.0.tgz",
+"integrity": "sha512-NX9BdgC1bL7UvWEnoc34u3oDdaLQWfWL3hLOK0rcgaK9rivbJW/sNFy1WGaSGBwcyUnaWAseQ/SqJhibrHJUVw==",
 "cpu": [
 "arm64"
 ],
@@ -645,14 +755,15 @@
 "os": [
 "linux"
 ],
+"peer": true,
 "engines": {
 "node": ">=16"
 }
 },
 "node_modules/@cloudflare/workerd-windows-64": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/@cloudflare/workerd-windows-64/-/workerd-windows-64-1.20250906.0.tgz",
-"integrity": "sha512-Q8Qjfs8jGVILnZL6vUpQ90q/8MTCYaGR3d1LGxZMBqte8Vr7xF3KFHPEy7tFs0j0mMjnqCYzlofmPNY+9ZaDRg==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-windows-64/-/workerd-windows-64-1.20250920.0.tgz",
+"integrity": "sha512-71Ef7fu/bh9GSA/wjqCgXbnWbkatm5KBJwf+3UB/DR7lU8+DHy3MVrXY3vPRk46ICT/C2EAoBZ4AKnjhiIF86w==",
 "cpu": [
 "x64"
 ],
@@ -662,6 +773,7 @@
 "os": [
 "win32"
 ],
+"peer": true,
 "engines": {
 "node": ">=16"
 }
@@ -2457,6 +2569,112 @@
 "node": ">=18.0.0"
 }
 },
+"node_modules/miniflare/node_modules/@cloudflare/workerd-darwin-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-64/-/workerd-darwin-64-1.20250906.0.tgz",
+"integrity": "sha512-E+X/YYH9BmX0ew2j/mAWFif2z05NMNuhCTlNYEGLkqMe99K15UewBqajL9pMcMUKxylnlrEoK3VNxl33DkbnPA==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"darwin"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/miniflare/node_modules/@cloudflare/workerd-darwin-arm64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-darwin-arm64/-/workerd-darwin-arm64-1.20250906.0.tgz",
+"integrity": "sha512-X5apsZ1SFW4FYTM19ISHf8005FJMPfrcf4U5rO0tdj+TeJgQgXuZ57IG0WeW7SpLVeBo8hM6WC8CovZh41AfnA==",
+"cpu": [
+"arm64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"darwin"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/miniflare/node_modules/@cloudflare/workerd-linux-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-64/-/workerd-linux-64-1.20250906.0.tgz",
+"integrity": "sha512-rlKzWgsLnlQ5Nt9W69YBJKcmTmZbOGu0edUsenXPmc6wzULUxoQpi7ZE9k3TfTonJx4WoQsQlzCUamRYFsX+0Q==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"linux"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/miniflare/node_modules/@cloudflare/workerd-linux-arm64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-linux-arm64/-/workerd-linux-arm64-1.20250906.0.tgz",
+"integrity": "sha512-DdedhiQ+SeLzpg7BpcLrIPEZ33QKioJQ1wvL4X7nuLzEB9rWzS37NNNahQzc1+44rhG4fyiHbXBPOeox4B9XVA==",
+"cpu": [
+"arm64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"linux"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/miniflare/node_modules/@cloudflare/workerd-windows-64": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/@cloudflare/workerd-windows-64/-/workerd-windows-64-1.20250906.0.tgz",
+"integrity": "sha512-Q8Qjfs8jGVILnZL6vUpQ90q/8MTCYaGR3d1LGxZMBqte8Vr7xF3KFHPEy7tFs0j0mMjnqCYzlofmPNY+9ZaDRg==",
+"cpu": [
+"x64"
+],
+"dev": true,
+"license": "Apache-2.0",
+"optional": true,
+"os": [
+"win32"
+],
+"engines": {
+"node": ">=16"
+}
+},
+"node_modules/miniflare/node_modules/workerd": {
+"version": "1.20250906.0",
+"resolved": "https://registry.npmjs.org/workerd/-/workerd-1.20250906.0.tgz",
+"integrity": "sha512-ryVyEaqXPPsr/AxccRmYZZmDAkfQVjhfRqrNTlEeN8aftBk6Ca1u7/VqmfOayjCXrA+O547TauebU+J3IpvFXw==",
+"dev": true,
+"hasInstallScript": true,
+"license": "Apache-2.0",
+"bin": {
+"workerd": "bin/workerd"
+},
+"engines": {
+"node": ">=16"
+},
+"optionalDependencies": {
+"@cloudflare/workerd-darwin-64": "1.20250906.0",
+"@cloudflare/workerd-darwin-arm64": "1.20250906.0",
+"@cloudflare/workerd-linux-64": "1.20250906.0",
+"@cloudflare/workerd-linux-arm64": "1.20250906.0",
+"@cloudflare/workerd-windows-64": "1.20250906.0"
+}
+},
 "node_modules/miniflare/node_modules/zod": {
 "version": "3.22.3",
 "resolved": "https://registry.npmjs.org/zod/-/zod-3.22.3.tgz",
@@ -2573,6 +2791,22 @@
 "node": "^10 || ^12 || >=14"
 }
 },
+"node_modules/prettier": {
+"version": "3.6.2",
+"resolved": "https://registry.npmjs.org/prettier/-/prettier-3.6.2.tgz",
+"integrity": "sha512-I7AIg5boAr5R0FFtJ6rCfD+LFsWHp81dolrFD8S79U9tb8Az2nGrJncnMSnys+bpQJfRUzqs9hnA81OAA3hCuQ==",
+"dev": true,
+"license": "MIT",
+"bin": {
+"prettier": "bin/prettier.cjs"
+},
+"engines": {
+"node": ">=14"
+},
+"funding": {
+"url": "https://github.com/prettier/prettier?sponsor=1"
+}
+},
 "node_modules/rollup": {
 "version": "4.52.0",
 "resolved": "https://registry.npmjs.org/rollup/-/rollup-4.52.0.tgz",
@@ -3049,12 +3283,14 @@
 }
 },
 "node_modules/workerd": {
-"version": "1.20250906.0",
-"resolved": "https://registry.npmjs.org/workerd/-/workerd-1.20250906.0.tgz",
-"integrity": "sha512-ryVyEaqXPPsr/AxccRmYZZmDAkfQVjhfRqrNTlEeN8aftBk6Ca1u7/VqmfOayjCXrA+O547TauebU+J3IpvFXw==",
+"version": "1.20250920.0",
+"resolved": "https://registry.npmjs.org/workerd/-/workerd-1.20250920.0.tgz",
+"integrity": "sha512-jo/9cRmeYQ8NM0x9yEt0O35NDvb3MPFsxWwhljtlA9sIOeXSEXDC7P82QcgDoBNNKHlyEcJdNI9OL/mxNnkWLw==",
 "dev": true,
 "hasInstallScript": true,
 "license": "Apache-2.0",
+"optional": true,
+"peer": true,
 "bin": {
 "workerd": "bin/workerd"
 },
@@ -3062,11 +3298,11 @@
 "node": ">=16"
 },
 "optionalDependencies": {
-"@cloudflare/workerd-darwin-64": "1.20250906.0",
-"@cloudflare/workerd-darwin-arm64": "1.20250906.0",
-"@cloudflare/workerd-linux-64": "1.20250906.0",
-"@cloudflare/workerd-linux-arm64": "1.20250906.0",
-"@cloudflare/workerd-windows-64": "1.20250906.0"
+"@cloudflare/workerd-darwin-64": "1.20250920.0",
+"@cloudflare/workerd-darwin-arm64": "1.20250920.0",
+"@cloudflare/workerd-linux-64": "1.20250920.0",
+"@cloudflare/workerd-linux-arm64": "1.20250920.0",
+"@cloudflare/workerd-windows-64": "1.20250920.0"
 }
 },
 "node_modules/wrangler": {
@@ -7,12 +7,15 @@
 		"dev": "wrangler dev",
 		"start": "wrangler dev",
 		"test": "vitest",
-		"cf-typegen": "wrangler types"
+		"cf-typegen": "wrangler types",
+		"format": "prettier --write .",
+		"format:check": "prettier --check ."
 	},
 	"devDependencies": {
 		"@cloudflare/vitest-pool-workers": "^0.8.19",
+		"prettier": "^3.6.2",
 		"typescript": "^5.5.2",
 		"vitest": "~3.2.0",
 		"wrangler": "^4.38.0"
 	}
-}
+}
241
src/cleaner.ts
Normal file
@@ -0,0 +1,241 @@
import type { ClearURLsRules } from "./types";

export async function cleanUrl(inputUrl: string, rules: ClearURLsRules, maxRedirects = 5, visited = new Set<string>()): Promise<string> {
	try {
		const currentUrl = new URL(inputUrl);

		if (visited.has(currentUrl.href) || visited.size >= maxRedirects) {
			return cleanUrlParameters(currentUrl, rules.providers);
		}

		visited.add(currentUrl.href);

		// Check for ClearURLs redirections first
		const clearUrlsRedirect = checkClearUrlsRedirections(currentUrl.href, rules.providers);
		if (clearUrlsRedirect && clearUrlsRedirect !== currentUrl.href) {
			return await cleanUrl(clearUrlsRedirect, rules, maxRedirects, visited);
		}

		// Then check for HTTP redirects
		const redirectTarget = await followRedirect(currentUrl.href);
		if (redirectTarget && redirectTarget !== currentUrl.href) {
			return await cleanUrl(redirectTarget, rules, maxRedirects, visited);
		}

		return cleanUrlParameters(currentUrl, rules.providers);
	} catch (error) {
		console.error(`Error caught when trying to clean url ${inputUrl}`, error);
		return inputUrl;
	}
}

async function followRedirect(url: string): Promise<string | null> {
	// @ts-ignore - Skip redirect following in tests to avoid external HTTP calls
	if (typeof global !== "undefined" && global.process?.env?.NODE_ENV === "test") {
		return null;
	}

	try {
		const response = await fetch(url, {
			method: "HEAD",
			redirect: "manual",
		});

		if (response.status >= 300 && response.status < 400) {
			const location = response.headers.get("Location");
			if (location) {
				// Resolve against the request URL so that absolute, relative ("/path", "path")
				// and protocol-relative ("//host/path") Location values are all handled
				return new URL(location, url).href;
			}
		}

		return null;
	} catch (error) {
		return null;
	}
}

function cleanUrlParameters(url: URL, providers: ClearURLsRules["providers"]): string {
	const matchingProvider = findMatchingProvider(url.href, providers);

	if (!matchingProvider) {
		return url.href;
	}

	if (matchingProvider.completeProvider === true) {
		throw new Error("URL blocked by ClearURLs rules");
	}

	if (matchingProvider.exceptions && isException(url.href, matchingProvider.exceptions)) {
		return url.href;
	}

	if (matchingProvider.rules && matchingProvider.rules.length > 0) {
		url = cleanParametersByRules(url, matchingProvider.rules);
		url = cleanFragmentsByRules(url, matchingProvider.rules);
	}

	if (matchingProvider.rawRules && matchingProvider.rawRules.length > 0) {
		let cleaned = url.href;
		for (const rawRule of matchingProvider.rawRules) {
			try {
				cleaned = cleaned.replace(new RegExp(rawRule, "gi"), "");
			} catch (error) {
				console.warn(`Invalid raw rule regex: ${rawRule}`, error);
			}
		}
		try {
			url = new URL(cleaned);
		} catch (error) {
			console.warn("Raw rule produced invalid URL, skipping", error);
		}
	}

	return url.href;
}

// Returns the first provider whose urlPattern matches the URL.
// globalRules is used only as a fallback when no specific provider matches.
function findMatchingProvider(url: string, providers: ClearURLsRules["providers"]) {
	const { globalRules, ...otherProviders } = providers;

	for (const [providerName, provider] of Object.entries(otherProviders)) {
		try {
			const regex = new RegExp(provider.urlPattern);
			if (regex.test(url)) {
				return provider;
			}
		} catch (error) {
			console.warn(`Invalid URL pattern for provider ${providerName}: ${provider.urlPattern}`, error);
		}
	}

	return globalRules;
}

function cleanParametersByRules(url: URL, rules: string[]) {
	const params = new URLSearchParams(url.search);
	const cleanParams = new URLSearchParams();

	for (const [key, value] of params) {
		let shouldRemove = false;

		for (const rule of rules) {
			try {
				// Anchor the rule to the whole parameter name; the "g" flag is not needed for a single test().
				if (new RegExp("^" + rule + "$", "i").test(key)) {
					shouldRemove = true;
					break;
				}
			} catch (error) {
				console.warn(`Invalid rule regex: ${rule}`, error);
			}
		}

		if (!shouldRemove) {
			// append() keeps repeated keys (e.g. ?id=1&id=2); set() would collapse them to the last value.
			cleanParams.append(key, value);
		}
	}

	url.search = cleanParams.toString();
	return url;
}

function cleanFragmentsByRules(url: URL, rules: string[]) {
	const fragments = extractFragments(url);
	const cleanFragments = new Map<string, string | null>();

	for (const [key, value] of fragments) {
		let shouldRemove = false;

		for (const rule of rules) {
			try {
				// Same matching as for query parameters: whole-name, case-insensitive.
				if (new RegExp("^" + rule + "$", "i").test(key)) {
					shouldRemove = true;
					break;
				}
			} catch (error) {
				console.warn(`Invalid rule regex: ${rule}`, error);
			}
		}

		if (!shouldRemove) {
			cleanFragments.set(key, value);
		}
	}

	url.hash = fragmentsToString(cleanFragments);
	return url;
}

function extractFragments(url: URL): Map<string, string | null> {
	const fragments = new Map<string, string | null>();
	const hash = url.hash.slice(1); // Remove the #

	if (!hash) return fragments;

	for (const p of hash.split("&")) {
		// Split on the first "=" only, so values that themselves contain "=" are kept intact.
		const separator = p.indexOf("=");
		const key = separator === -1 ? p : p.slice(0, separator);
		if (!key) continue;

		// A missing or empty value is stored as null and serialized back as the bare key.
		const value = separator === -1 ? null : p.slice(separator + 1) || null;
		fragments.set(key, value);
	}

	return fragments;
}

function fragmentsToString(fragments: Map<string, string | null>): string {
	const parts: string[] = [];
	for (const [key, value] of fragments) {
		if (value !== null) {
			parts.push(key + "=" + value);
		} else {
			parts.push(key);
		}
	}
	return parts.length > 0 ? parts.join("&") : "";
}

function isException(url: string, exceptions: string[]): boolean {
	for (const exception of exceptions) {
		try {
			const regex = new RegExp(exception, "i");
			if (regex.test(url)) {
				return true;
			}
		} catch (error) {
			console.warn(`Invalid exception regex: ${exception}`, error);
		}
	}
	return false;
}

function checkClearUrlsRedirections(url: string, providers: ClearURLsRules["providers"]): string | null {
	const matchingProvider = findMatchingProvider(url, providers);

	if (!matchingProvider || !matchingProvider.redirections) {
		return null;
	}

	for (const redirectionPattern of matchingProvider.redirections) {
		try {
			const regex = new RegExp(redirectionPattern, "i");
			const match = url.match(regex);
			if (match && match[1]) {
				// First capture group is the target URL
				return decodeURIComponent(match[1]);
			}
		} catch (error) {
			console.warn(`Invalid redirection regex: ${redirectionPattern}`, error);
		}
	}

	return null;
}
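The query-parameter step in `src/cleaner.ts` can be tried in isolation. The following is a minimal sketch, not the commit's code: `stripParams` and the sample rules are hypothetical names introduced for illustration, and it assumes every rule is a valid regular expression. It shows the same idea as `cleanParametersByRules`: each rule is anchored to the whole parameter name and matched case-insensitively.

```typescript
// Hypothetical standalone helper (not part of the commit).
// Assumes each entry in `rules` is a valid regular expression source.
function stripParams(input: string, rules: string[]): string {
	const url = new URL(input);
	// Anchor each rule so that "id" does not also match "video_id".
	const matchers = rules.map((rule) => new RegExp("^" + rule + "$", "i"));
	const kept = new URLSearchParams();

	url.searchParams.forEach((value, key) => {
		const tracked = matchers.some((matcher) => matcher.test(key));
		if (!tracked) {
			// append() preserves repeated keys such as ?id=1&id=2.
			kept.append(key, value);
		}
	});

	url.search = kept.toString();
	return url.href;
}

console.log(stripParams("https://example.com/?utm_source=a&id=1&id=2", ["utm_(?:source|medium)"]));
```

Building the anchored expressions once per call, rather than once per parameter, is a small design choice that avoids recompiling the same patterns inside the loop.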
46
src/index.ts
@@ -1,18 +1,36 @@
-/**
- * Welcome to Cloudflare Workers! This is your first worker.
- *
- * - Run `npm run dev` in your terminal to start a development server
- * - Open a browser tab at http://localhost:8787/ to see your worker in action
- * - Run `npm run deploy` to publish your worker
- *
- * Bind resources to your worker in `wrangler.jsonc`. After adding bindings, a type definition for the
- * `Env` object can be regenerated with `npm run cf-typegen`.
- *
- * Learn more at https://developers.cloudflare.com/workers/
- */
+import { cleanUrl } from "./cleaner";
+import { RulesCache } from "./rules-cache";
+
+type Env = {
+	RULES_CACHE: DurableObjectNamespace<RulesCache>;
+};
+
+export { RulesCache };

 export default {
-	async fetch(request, env, ctx): Promise<Response> {
-		return new Response('Hello World!');
+	async fetch(request, env, _ctx): Promise<Response> {
+		const url = new URL(request.url);
+		const targetUrl = url.searchParams.get("url");
+
+		if (!targetUrl) {
+			return new Response("Missing url parameter", { status: 400 });
+		}
+
+		try {
+			const rulesStub = env.RULES_CACHE.getByName("rules");
+			const rules = await rulesStub.getRules();
+
+			const cleanedUrl = await cleanUrl(targetUrl, rules);
+			return new Response(cleanedUrl, {
+				headers: {
+					"Content-Type": "text/plain",
+					"Access-Control-Allow-Origin": "*",
+				},
+			});
+		} catch (error) {
+			return new Response(`Error processing URL: ${error instanceof Error ? error.message : "Unknown error"}`, {
+				status: 500,
+			});
+		}
 	},
 } satisfies ExportedHandler<Env>;
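Because the handler reads the target with `url.searchParams.get("url")`, a caller has to percent-encode the target URL; otherwise the target's own `&`-separated parameters are parsed as parameters of the worker request. A minimal client-side sketch follows. `buildCleanerRequest` is a hypothetical helper and `https://worker.example` is a placeholder origin, not a real deployment.

```typescript
// Hypothetical client helper (not part of the commit).
// `workerOrigin` is a placeholder; substitute the origin of an actual deployment.
function buildCleanerRequest(workerOrigin: string, target: string): string {
	return workerOrigin + "/?url=" + encodeURIComponent(target);
}

const sampleTarget = "https://youtube.com/watch?v=abc&feature=share";

// Encoded: the worker sees the complete target URL.
const encodedRequest = buildCleanerRequest("https://worker.example", sampleTarget);
const seenWhenEncoded = new URL(encodedRequest).searchParams.get("url");

// Not encoded: "&feature=share" becomes a separate parameter of the worker request.
const rawRequest = "https://worker.example/?url=" + sampleTarget;
const seenWhenRaw = new URL(rawRequest).searchParams.get("url");

console.log(seenWhenEncoded);
console.log(seenWhenRaw);
```

The tests in this commit do the same thing with `encodeURIComponent(testUrl)` when building the request.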
77
src/rules-cache.ts
Normal file
@@ -0,0 +1,77 @@
import { DurableObject } from "cloudflare:workers";
import type { ClearURLsRules } from "./types";

const RULES_URL = "https://rules2.clearurls.xyz/data.minify.json";
const HASH_URL = "https://rules2.clearurls.xyz/rules.minify.hash";
const CACHE_DURATION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

type CachedRules = {
	data: ClearURLsRules;
	hash: string;
	cachedAt: number;
	expiresAt: number;
};

export class RulesCache extends DurableObject {
	async getRules() {
		try {
			const cached = await this.ctx.storage.get<CachedRules>("rules");

			if (cached && Date.now() < cached.expiresAt) {
				return cached.data;
			}

			console.log("Fetching fresh rules from ClearURLs");
			return await this.fetchAndCacheRules();
		} catch (error) {
			console.error("Error getting rules:", error);

			// Try to return cached rules even if expired as fallback
			const cached = await this.ctx.storage.get<CachedRules>("rules");
			if (cached) {
				console.log("Falling back to expired cached rules");
				return cached.data;
			}

			throw new Error("Failed to get rules and no cached fallback available");
		}
	}

	private async fetchAndCacheRules() {
		const [rulesResponse, hashResponse] = await Promise.all([fetch(RULES_URL), fetch(HASH_URL)]);
		if (!rulesResponse.ok) {
			throw new Error(`Failed to fetch rules: ${rulesResponse.status}`);
		}
		if (!hashResponse.ok) {
			throw new Error(`Failed to fetch hash: ${hashResponse.status}`);
		}

		const [rulesText, expectedHash] = await Promise.all([rulesResponse.text(), hashResponse.text()]);
		const actualHash = await this.calculateSHA256(rulesText);
		if (actualHash !== expectedHash.trim()) {
			throw new Error(`Hash validation failed. Expected: ${expectedHash.trim()}, Actual: ${actualHash}`);
		}

		const rules = JSON.parse(rulesText) as ClearURLsRules;
		const now = Date.now();
		const cachedRules: CachedRules = {
			data: rules,
			hash: actualHash,
			cachedAt: now,
			expiresAt: now + CACHE_DURATION_MS,
		};

		await this.ctx.storage.put("rules", cachedRules);
		console.log(`Cached rules with hash: ${actualHash}`);

		return rules;
	}

	private async calculateSHA256(text: string) {
		const encoder = new TextEncoder();
		const data = encoder.encode(text);
		const hashBuffer = await crypto.subtle.digest("SHA-256", data);
		const hashArray = Array.from(new Uint8Array(hashBuffer));
		return hashArray.map((b) => b.toString(16).padStart(2, "0")).join("");
	}
}
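The cache policy in `getRules` distinguishes three cases: a fresh entry is returned directly, an expired entry is kept only as a fallback if refetching fails, and a missing entry forces a fetch. A rough sketch of that decision as a pure function is shown below. The names `CachedEntry`, `makeEntry` and `lookup` are hypothetical, and storage and network access are deliberately left out.

```typescript
// Hypothetical, storage-free model of the freshness check (not part of the commit).
type CachedEntry<T> = { data: T; cachedAt: number; expiresAt: number };

type Lookup<T> = { kind: "fresh"; data: T } | { kind: "stale"; data: T } | { kind: "miss" };

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Stamp an entry with its creation time and expiry, mirroring cachedAt/expiresAt.
function makeEntry<T>(data: T, now: number, ttlMs: number = SEVEN_DAYS_MS): CachedEntry<T> {
	return { data: data, cachedAt: now, expiresAt: now + ttlMs };
}

// "fresh" can be served as-is; "stale" is only a fallback when a refetch fails.
function lookup<T>(entry: CachedEntry<T> | undefined, now: number): Lookup<T> {
	if (!entry) {
		return { kind: "miss" };
	}
	if (now < entry.expiresAt) {
		return { kind: "fresh", data: entry.data };
	}
	return { kind: "stale", data: entry.data };
}

const sampleEntry = makeEntry("rules-v1", 1000, 500);
console.log(lookup(sampleEntry, 1499).kind, lookup(sampleEntry, 1500).kind, lookup(undefined, 0).kind);
```

Passing `now` in as a parameter, instead of calling `Date.now()` inside, keeps the expiry boundary easy to test.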
14
src/types.ts
Normal file
@@ -0,0 +1,14 @@
export type ClearURLsProvider = {
	urlPattern: string;
	completeProvider?: boolean;
	rules?: string[];
	rawRules?: string[];
	referralMarketing?: string[];
	exceptions?: string[];
	redirections?: string[];
	forceRedirection?: boolean;
};

export type ClearURLsRules = {
	providers: Record<string, ClearURLsProvider>;
};
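These types describe the downloaded JSON, but `JSON.parse(...) as ClearURLsRules` does not check anything at run time. If a run-time check were wanted, it could look roughly like the sketch below. `looksLikeRules` and `isStringArray` are hypothetical helpers, and the check covers only the fields declared above.

```typescript
// Hypothetical run-time shape check (not part of the commit).
function isStringArray(value: unknown): boolean {
	return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// The optional list-valued fields from ClearURLsProvider.
const LIST_FIELDS = ["rules", "rawRules", "referralMarketing", "exceptions", "redirections"];

function looksLikeRules(value: unknown): boolean {
	if (typeof value !== "object" || value === null) {
		return false;
	}
	const providers = (value as { providers?: unknown }).providers;
	if (typeof providers !== "object" || providers === null) {
		return false;
	}
	const byName = providers as Record<string, unknown>;
	return Object.keys(byName).every((name) => {
		const candidate = byName[name];
		if (typeof candidate !== "object" || candidate === null) {
			return false;
		}
		const provider = candidate as Record<string, unknown>;
		// urlPattern is the only required field; list fields may be absent.
		if (typeof provider.urlPattern !== "string") {
			return false;
		}
		return LIST_FIELDS.every((field) => provider[field] === undefined || isStringArray(provider[field]));
	});
}

console.log(looksLikeRules({ providers: { globalRules: { urlPattern: ".*", rules: ["utm_source"] } } }));
```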
2
test/env.d.ts
vendored
@@ -1,3 +1,3 @@
-declare module 'cloudflare:test' {
+declare module "cloudflare:test" {
 	interface ProvidedEnv extends Env {}
 }
@@ -1,24 +1,160 @@
-import { env, createExecutionContext, waitOnExecutionContext, SELF } from 'cloudflare:test';
-import { describe, it, expect } from 'vitest';
-import worker from '../src/index';
+import { env, createExecutionContext, waitOnExecutionContext, SELF } from "cloudflare:test";
+import { describe, it, expect, beforeAll } from "vitest";
+
+import worker, { RulesCache } from "../src/index";
+import type { ClearURLsRules } from "../src/types";

 // For now, you'll need to do something like this to get a correctly-typed
 // `Request` to pass to `worker.fetch()`.
 const IncomingRequest = Request<unknown, IncomingRequestCfProperties>;

-describe('Hello World worker', () => {
-	it('responds with Hello World! (unit style)', async () => {
-		const request = new IncomingRequest('http://example.com');
-		// Create an empty context to pass to `worker.fetch()`.
+// Sampled ClearURLs rules for testing
+const mockRules: ClearURLsRules = {
+	providers: {
+		Google: {
+			urlPattern: "^https?:\\/\\/(?:[a-z0-9-]+\\.)*?google(?:\\.[a-z]{2,}){1,}",
+			completeProvider: false,
+			rules: ["ved", "ei", "source", "gs_lcp", "aqs", "sourceid", "uact", "rlz", "sclient", "client"],
+			rawRules: [],
+			referralMarketing: [],
+			exceptions: [],
+			redirections: [],
+			forceRedirection: false,
+		},
+		YouTube: {
+			urlPattern: "^https?:\\/\\/(?:[a-z0-9-]+\\.)*?(youtube\\.com|youtu\\.be)",
+			completeProvider: false,
+			rules: ["feature", "gclid", "si", "pp", "ab_channel"],
+			rawRules: [],
+			referralMarketing: [],
+			exceptions: [],
+			redirections: [],
+			forceRedirection: false,
+		},
+		Amazon: {
+			urlPattern: "^https?:\\/\\/(?:[a-z0-9-]+\\.)*?amazon(?:\\.[a-z]{2,}){1,}",
+			completeProvider: false,
+			rules: ["qid", "sr", "ref_", "keywords", "sprefix", "tag", "linkCode", "camp", "creative", "creativeASIN", "psc"],
+			rawRules: [],
+			referralMarketing: [],
+			exceptions: [],
+			redirections: [],
+			forceRedirection: false,
+		},
+		TikTok: {
+			urlPattern: "^https?:\\/\\/(?:[a-z0-9-]+\\.)*?tiktok\\.com",
+			completeProvider: false,
+			rules: ["u_code", "_d", "_t", "timestamp", "share_app_name", "_r", "checksum", "language"],
+			rawRules: [],
+			referralMarketing: [],
+			exceptions: [],
+			redirections: [],
+			forceRedirection: false,
+		},
+		globalRules: {
+			urlPattern: ".*",
+			completeProvider: false,
+			rules: [
+				"utm_source",
+				"utm_medium",
+				"utm_campaign",
+				"utm_term",
+				"utm_content",
+				"mtm_campaign",
+				"mtm_kwd",
+				"ga_source",
+				"ga_medium",
+				"ga_term",
+				"ga_content",
+				"ga_campaign",
+				"yclid",
+				"_openstat",
+				"fbclid",
+				"gclid",
+				"msclkid",
+			],
+			rawRules: [],
+			referralMarketing: [],
+			exceptions: [],
+			redirections: [],
+			forceRedirection: false,
+		},
+	},
+};
+
+beforeAll(() => {
+	const mockStub = {
+		getRules: () => Promise.resolve(mockRules),
+	} as DurableObjectStub<RulesCache>;
+
+	if (env.RULES_CACHE) {
+		env.RULES_CACHE.getByName = () => mockStub;
+	}
+});
+
+describe("URL Cleaner worker", () => {
+	it("cleans global tracking parameters", async () => {
+		const testUrl = "https://example.com?utm_source=test&utm_medium=email&normal=keep";
+		const request = new IncomingRequest(`http://example.com/?url=${encodeURIComponent(testUrl)}`);
 		const ctx = createExecutionContext();
 		const response = await worker.fetch(request, env, ctx);
 		// Wait for all `Promise`s passed to `ctx.waitUntil()` to settle before running test assertions
 		await waitOnExecutionContext(ctx);
-		expect(await response.text()).toMatchInlineSnapshot(`"Hello World!"`);
+
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://example.com/?normal=keep");
 	});

-	it('responds with Hello World! (integration style)', async () => {
-		const response = await SELF.fetch('https://example.com');
-		expect(await response.text()).toMatchInlineSnapshot(`"Hello World!"`);
+	it("cleans YouTube tracking parameters", async () => {
+		const testUrl = "https://youtube.com/watch?v=abc123&feature=share&si=trackingid&t=30";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://youtube.com/watch?v=abc123&t=30");
+	});
+
+	it("cleans Amazon tracking parameters", async () => {
+		const testUrl = "https://amazon.com/product?keywords=test&ref_=test&tag=mytag&normal=keep";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://amazon.com/product?normal=keep");
+	});
+
+	it("cleans Google tracking parameters", async () => {
+		const testUrl = "https://google.com/search?q=test&ved=123&ei=456&tbm=isch";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://google.com/search?q=test&tbm=isch");
+	});
+
+	it("cleans TikTok tracking parameters specifically", async () => {
+		const testUrl = "https://tiktok.com/video?_t=tracking&_r=more&u_code=123&normal=keep&other=stay";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://tiktok.com/video?normal=keep&other=stay");
+	});
+
+	it("handles unknown domains gracefully", async () => {
+		const testUrl = "https://unknown-site.com?page=1&sort=name&utm_source=test&fbclid=spam";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://unknown-site.com/?page=1&sort=name");
+	});
+
+	it("returns error for missing URL parameter", async () => {
+		const response = await SELF.fetch("https://example.com/");
+		expect(response.status).toBe(400);
+		expect(await response.text()).toBe("Missing url parameter");
+	});
+
+	it("cleans URL fragments (hash parameters)", async () => {
+		const testUrl = "https://example.com/page?normal=keep&utm_source=test#utm_campaign=fragment&other=stay";
+		const response = await SELF.fetch(`https://example.com/?url=${encodeURIComponent(testUrl)}`);
+		const cleanedUrl = await response.text();
+		expect(cleanedUrl).toBe("https://example.com/page?normal=keep#other=stay");
+	});
+
+	it("handles invalid URLs gracefully", async () => {
+		const response = await SELF.fetch("https://example.com/?url=not-a-valid-url");
+		expect(response.status).toBe(200); // Should return original URL, not error
+		expect(await response.text()).toBe("not-a-valid-url");
 	});
 });
@@ -36,9 +36,7 @@

 		/* Skip type checking all .d.ts files. */
 		"skipLibCheck": true,
-		"types": [
-			"./worker-configuration.d.ts"
-		]
+		"types": ["./worker-configuration.d.ts"]
 	},
 	"exclude": ["test"],
 	"include": ["worker-configuration.d.ts", "src/**/*.ts"]
@@ -1,10 +1,10 @@
-import { defineWorkersConfig } from '@cloudflare/vitest-pool-workers/config';
+import { defineWorkersConfig } from "@cloudflare/vitest-pool-workers/config";

 export default defineWorkersConfig({
 	test: {
 		poolOptions: {
 			workers: {
-				wrangler: { configPath: './wrangler.jsonc' },
+				wrangler: { configPath: "./wrangler.jsonc" },
 			},
 		},
 	},
13348
worker-configuration.d.ts
vendored
File diff suppressed because it is too large
@@ -7,9 +7,23 @@
 	"name": "url-cleaner",
 	"main": "src/index.ts",
 	"compatibility_date": "2025-09-20",
+	"migrations": [
+		{
+			"new_sqlite_classes": ["RulesCache"],
+			"tag": "v1",
+		},
+	],
+	"durable_objects": {
+		"bindings": [
+			{
+				"class_name": "RulesCache",
+				"name": "RULES_CACHE",
+			},
+		],
+	},
 	"observability": {
-		"enabled": true
-	}
+		"enabled": true,
+	},
 	/**
 	 * Smart Placement
 	 * Docs: https://developers.cloudflare.com/workers/configuration/smart-placement/#smart-placement
@@ -40,4 +54,4 @@
 	 * https://developers.cloudflare.com/workers/wrangler/configuration/#service-bindings
 	 */
 	// "services": [{ "binding": "MY_SERVICE", "service": "my-service" }]
-}
+}