-
-
Notifications
You must be signed in to change notification settings - Fork 10.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🔒 Made names for uploaded files more secure and resilient to filesystems' limits #22055
Conversation
WalkthroughThe pull request introduces modifications to file handling and storage mechanisms across multiple files in the Ghost project. The primary changes involve enhancing the flexibility of file saving by adding a Changes
Possibly related PRs
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
ref https://linear.app/ghost/issue/ENG-1260 ref https://linear.app/ghost/issue/ENG-1859 - names for uploaded files now include a 16-character alphanumeric hash, that gets re-generated on every upload. This prevents original images to be found, when e.g. they have been edited in the editor - if the filename length is higher than what most unix filesystem accept (e.g. 255 bytes), the name gets truncated. More precisely, we truncate the stem of the filename while keeping the hash, the file extension and eventually a suffix (e.g. "_o") intact - Example: - original file name: "long-name-for-an-image-imagine-255bytes-followed-by-original-suffix_o.png" - saved file name: "long-name-for-an-image-1a2b3c4d5e6f78_o.png"
5fae4eb
to
2b1cbcf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ghost/core/test/e2e-api/admin/files.test.js (1)
32-32
: LGTM! Consider adding more test cases.The updated test correctly validates the new secure filename pattern. However, consider adding test cases for:
- Files with very long names to verify truncation
- Files with special characters
- Files with different extensions
ghost/core/core/server/api/endpoints/images.js (1)
74-75
: LGTM! Consider adding documentation.The
keepOriginalName
flag correctly preserves the original filename while maintaining security through the hash suffix. Consider adding a comment explaining why we preserve the original name for unprocessed images.name: imageTransform.generateOriginalImageName(processedImageName), -keepOriginalName: true // Don't generate a new filename on save +// We preserve the original filename (with hash suffix) to maintain a clear link +// between processed and original images for debugging and auditing purposes +keepOriginalName: true
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
yarn.lock
is excluded by!**/yarn.lock
,!**/*.lock
📒 Files selected for processing (6)
ghost/core/core/server/adapters/storage/LocalStorageBase.js
(2 hunks)ghost/core/core/server/api/endpoints/images.js
(1 hunks)ghost/core/package.json
(1 hunks)ghost/core/test/e2e-api/admin/files.test.js
(1 hunks)ghost/core/test/unit/server/adapters/storage/LocalImagesStorage.test.js
(5 hunks)ghost/oembed-service/lib/OEmbedService.js
(2 hunks)
🔇 Additional comments (9)
ghost/core/test/unit/server/adapters/storage/LocalImagesStorage.test.js (3)
50-52
: LGTM! Test assertions correctly updated for secure filenames.The updated assertions properly validate the new filename pattern with the 16-character hash while maintaining readability.
Also applies to: 55-58, 61-64, 67-70
123-141
: LGTM! Good test coverage for file extensions.The new test cases thoroughly cover:
- Valid extensions (.zip, .jpeg)
- Invalid extensions (.0)
- Hash generation for each case
8-8
: LGTM! Good switch to Node's assert module.Using Node's built-in
assert
module improves maintainability and reduces dependencies.ghost/core/core/server/adapters/storage/LocalStorageBase.js (3)
46-49
: LGTM! Well-documented parameters.The JSDoc clearly describes the new parameters and their purposes.
68-69
: LGTM! Proper error handling and path normalization.The implementation:
- Creates directories as needed
- Safely copies files
- Properly normalizes paths for URLs
Also applies to: 84-84
54-65
: LGTM! Secure and flexible filename generation.The implementation:
- Safely handles original filenames when needed
- Generates secure random hashes
- Prevents path traversal
- Handles filesystem limits
Let's verify the security of the filename generation:
✅ Verification successful
Filename generation implementation is secure and well-designed
The implementation uses multiple layers of security:
- Prevents directory traversal through path.join() usage
- Generates secure random filenames by default (16-char hash)
- Properly handles filesystem limits
- Safely handles original filenames when explicitly requested
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for potential path traversal vulnerabilities in filename handling # Test: Search for path normalization and validation rg -A 3 "path\.(?:normalize|resolve|join)" # Test: Search for directory traversal prevention rg -A 3 "\.\./"Length of output: 66291
Script:
#!/bin/bash # Search for getUniqueSecureFilePath implementation and related security functions ast-grep --pattern 'getUniqueSecureFilePath($$$) { $$$ }' # Search for path validation and sanitization rg -A 5 "(?i)sanitize.*path|validate.*path|clean.*path" # Search for directory traversal prevention patterns rg -A 5 "(?i)(\.\.\/|directory.*traversal|path.*traversal)"Length of output: 123524
Script:
#!/bin/bash # Search for directory traversal prevention patterns rg -A 5 "\.\./" rg -A 5 "directory.*traversal" rg -A 5 "path.*traversal" # Search for getUniqueSecureFilePath implementation rg -A 10 "getUniqueSecureFilePath" # Search for path validation and sanitization rg -A 5 "sanitize(Path|Filename|File)" rg -A 5 "validate(Path|Filename|File)"Length of output: 71114
Script:
#!/bin/bash # Search LocalStorageBase.js for security-related code rg -A 10 "class LocalStorageBase|getUniqueSecureFilePath|sanitize|validate|secure" "ghost/core/core/server/adapters/storage/LocalStorageBase.js" # Search for path validation patterns rg -A 5 "checkSafe|validatePath|ensureSafe|safePath" "ghost/core/core/server/adapters/storage/LocalStorageBase.js" # Search for crypto/hash generation code rg -A 5 "crypto|hash|random" "ghost/core/core/server/adapters/storage/LocalStorageBase.js"Length of output: 1842
ghost/oembed-service/lib/OEmbedService.js (2)
137-143
: LGTM! Improved error handling for image fetching.The added error check for the response status enhances robustness and provides better error reporting.
165-165
: Verify the storage adapter API compatibility.The change from
getSanitizedFileName
tosanitizeFileName
aligns with the PR objectives for secure file naming.Let's verify the method exists in the updated ghost-storage-base package:
Also applies to: 167-167
ghost/core/package.json (1)
192-192
: Verify the ghost-storage-base package update.The update from 1.0.0 to 1.1.1 should include the new file naming functionality.
Let's check the package's changelog for any breaking changes:
✅ Verification successful
✓ Ghost Storage Base update is safe to proceed
The update includes the expected new file naming functionality with improved security features while maintaining backward compatibility through deprecated methods. The version bump from 1.0.0 to 1.1.1 is appropriate for these non-breaking changes.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check ghost-storage-base changelog for version 1.1.1 # Fetch package information from npm curl -s https://registry.npmjs.org/ghost-storage-base | jq '.versions["1.1.1"].time, .versions["1.1.1"].description' # Check for any GitHub releases gh api repos/TryGhost/Ghost-Storage-Base/releases/tags/v1.1.1 --jq '.body'Length of output: 380
Script:
#!/bin/bash # Description: Check ghost-storage-base repository for changelog and version history # Clone the repository to check for changelog git clone --depth 1 https://github.com/TryGhost/Ghost-Storage-Base.git temp-storage-base cd temp-storage-base # Check for changelog file echo "=== Checking for CHANGELOG.md ===" cat CHANGELOG.md 2>/dev/null || echo "No CHANGELOG.md found" # Get commit history between versions echo -e "\n=== Commit history between v1.0.0 and v1.1.1 ===" git fetch --tags git log --oneline v1.0.0..v1.1.1 2>/dev/null || echo "Could not find version tags" # Check package.json diff between versions if possible echo -e "\n=== Package.json changes between versions ===" git diff v1.0.0..v1.1.1 package.json 2>/dev/null || echo "Could not diff package.json"Length of output: 1412
Script:
#!/bin/bash # Description: Check ghost-storage-base changes since v1.0.0 cd temp-storage-base # Get commit history from v1.0.0 to main echo "=== Commits from v1.0.0 to main ===" git log --oneline v1.0.0..main # Check for code changes in core files echo -e "\n=== Code changes in core files since v1.0.0 ===" git diff v1.0.0..main -- '*.js' ':!test/*'Length of output: 7452
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
ghost/core/test/e2e-api/admin/images.test.js (1)
322-322
: Consider adding more edge cases to filename tests.While the current tests cover basic scenarios, consider adding tests for:
- International characters in filenames
- Special characters that might need escaping
- Maximum hash collision scenarios
Example test case:
+ it('Handles international characters in filenames correctly', async function () { + const originalFilePath = p.join(__dirname, '/../../utils/fixtures/images/ghost-logo.png'); + await uploadImageCheck({ + path: originalFilePath, + filename: 'テスト画像.png', + contentType: 'image/png' + }); + });Also applies to: 332-332, 337-337
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ghost/core/test/e2e-api/admin/images.test.js
(5 hunks)
🔇 Additional comments (4)
ghost/core/test/e2e-api/admin/images.test.js (4)
16-16
: LGTM: Crypto module import aligns with PR objectives.The addition of the crypto module is appropriate for generating secure random hashes for file names.
57-60
: LGTM: Function signature simplification.Removing the
expectedOriginalFileName
parameter simplifies the interface while maintaining functionality.
252-258
: LGTM: Robust filename length handling test.The test case correctly verifies that:
- Long filenames are truncated
- The hash and extension are preserved
- The total length stays under 253 bytes
61-65
: Verify the regex pattern's security implications.The new file name pattern uses a 16-character hash, but we should verify that the regex doesn't allow any potentially unsafe characters.
✅ Verification successful
The regex pattern is secure and properly validates the file name structure
The pattern
${parsedFileName.name}-\\w{16}${parsedFileName.ext}
is used only for testing and matches the secure implementation where:
- Original file names are parsed safely using Node's path.parse()
- A 16-character random hash is generated using crypto
- Only alphanumeric characters and underscores are allowed in the hash
- File extensions are preserved but validated
- SVG files undergo sanitization to prevent XSS
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Analyze the file name pattern construction and usage # Test: Check for any file name sanitization in the codebase # Look for file name sanitization logic rg -A 5 'sanitize.*filename|filename.*sanitize' # Check for any file name validation patterns ast-grep --pattern 'new RegExp($$$)'Length of output: 68767
Script:
#!/bin/bash # Search for file name hash/ID generation and path handling rg -A 5 "generateFileName|generateId|generateHash" ghost/core/ # Look for file name processing rg -A 5 "parsedFileName|path\.parse" ghost/core/ # Check for path traversal prevention rg -A 5 "sanitize.*path|normalize.*path|resolve.*path" ghost/core/Length of output: 67544
Script:
#!/bin/bash # Look for image upload handling and file name generation rg -A 10 "uploadImage|handleImage|processImage" ghost/core/ # Search for file name validation/sanitization in storage adapters rg -A 5 "getTargetDir|getUniqueFileName|getSafeFileName" ghost/core/Length of output: 68946
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
ghost/core/test/e2e-api/admin/images.test.js (1)
309-311
: Consider extracting magic numbers into constants.The values
253
and8
(for hash bytes) appear to be important constants. Consider extracting them into named constants at the module level for better maintainability.+const MAX_FILENAME_BYTES = 253; +const HASH_BYTES = 8; + const ext = '.png'; -const hash = `-${crypto.randomBytes(8).toString('hex')}`; -const truncatedNameLength = 253 - hash.length - ext.length; +const hash = `-${crypto.randomBytes(HASH_BYTES).toString('hex')}`; +const truncatedNameLength = MAX_FILENAME_BYTES - hash.length - ext.length;ghost/core/core/server/services/themes/storage.js (1)
Line range hint
49-78
: Consider additional improvements for theme file handling.While the current changes align the code with the new file naming strategy, consider these improvements:
- The backup name generation uses
ObjectID()
- consider using the same secure naming strategy as other files for consistency- The error messages might need updates to reflect new file naming patterns
- Theme validation might need updates to handle new file name patterns
Example improvement for backup name generation:
- const backupName = `${themeName}_${ObjectID()}`; + const backupName = getStorage().generateSecureFileName(themeName, 'backup');
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
ghost/core/core/server/services/themes/storage.js
(3 hunks)ghost/core/test/e2e-api/admin/images.test.js
(4 hunks)ghost/core/test/e2e-api/admin/media.test.js
(9 hunks)
🔇 Additional comments (8)
ghost/core/test/e2e-api/admin/media.test.js (3)
41-42
: LGTM! Secure filename pattern implementation.The new regex pattern correctly validates the 16-character hash addition to filenames, enhancing security by making filenames unpredictable.
57-57
: Consider adding test coverage for original file paths.The changes to
uploadImageCheck
and the handling of original file paths look good, but consider adding test cases that specifically verify the original file paths maintain the same hash as their processed counterparts.Also applies to: 73-77
204-205
: LGTM! Consistent thumbnail naming pattern.The thumbnail URL generation correctly maintains consistency with the parent media file's hash pattern.
ghost/core/test/e2e-api/admin/images.test.js (3)
307-314
: LGTM! Comprehensive filename length handling.The test effectively verifies that:
- Long filenames are truncated to stay under 253 bytes
- The hash and extension are preserved during truncation
- The truncation logic maintains filename uniqueness
319-321
: LGTM! Duplicate filename handling.The test correctly verifies that multiple files with the same name are handled properly by using the hash for uniqueness.
331-331
: LGTM! Edge case handling for filename suffixes.The tests properly verify the handling of special suffixes (_o) in filenames, ensuring they're either trimmed or preserved based on their position.
Also applies to: 336-336
ghost/core/core/server/services/themes/storage.js (2)
77-78
: Document why themes preserve original names.While the PR aims to make file names more secure by obscuring them, themes explicitly preserve original names. This exception should be documented to explain why themes require different handling.
Consider:
- Adding a comment explaining why themes preserve original names
- Documenting this exception in the architecture documentation
- Verifying that this doesn't create security issues specific to theme handling
✅ Verification successful
Preserving original theme names is a valid architectural decision
Keeping original theme names is necessary for proper theme management and activation. The codebase shows that theme names are integral identifiers used throughout the system for validation, caching, and activation processes. Security is handled separately through theme validation (gscan) and activation checks, making name preservation safe in this context.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check for potential security implications of keeping original theme names # Look for any security-related comments or documentation about theme names echo "Checking for existing security documentation about theme names:" rg -B 2 -A 2 "theme.*name.*security|security.*theme.*name" --type md # Check if theme names are validated elsewhere echo -e "\nChecking for theme name validation:" rg "theme.*name.*valid|valid.*theme.*name" --type jsLength of output: 1837
Script:
#!/bin/bash # Check theme validation and security measures # Look for theme validation implementation echo "Theme validation implementation:" ast-grep --pattern 'function validateTheme($_) { $$$ }' # Check theme name format requirements echo -e "\nTheme name format requirements:" rg -B 2 -A 2 "theme.*name.*format|format.*theme.*name" # Look for theme-related security measures echo -e "\nTheme security measures:" rg -B 2 -A 2 "validateTheme|themeValidation" --type jsLength of output: 390
Script:
#!/bin/bash # Search for theme validation and name handling # Find theme-related files echo "Theme-related files:" fd "theme" --type f --extension js # Look for theme validation in validate.js echo -e "\nTheme validation code:" rg -B 2 -A 2 "class|function" ghost/core/core/server/services/themes/validate.js || true # Check theme installation/loading code echo -e "\nTheme installation/loading:" rg -B 2 -A 2 "loadTheme|installTheme|themeLoader" --type jsLength of output: 9519
49-49
: Verify the storage adapter API change.The change from
getSanitizedFileName
tosanitizeFileName
needs verification to ensure the new method exists and is implemented correctly in the storage adapter.
Deployed to staging with ID: |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #22055 +/- ##
==========================================
+ Coverage 74.01% 74.03% +0.01%
==========================================
Files 1272 1272
Lines 75840 76008 +168
Branches 10102 10130 +28
==========================================
+ Hits 56130 56269 +139
- Misses 18766 18791 +25
- Partials 944 948 +4
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
ref https://linear.app/ghost/issue/ENG-1260
ref https://linear.app/ghost/issue/ENG-1859
long-name-for-an-image-imagine-255bytes-here-blablablablabla-and-a-bit-more.png
long-name-for-an-image-1a2b3c4d5e6f7g890.png