-
Notifications
You must be signed in to change notification settings - Fork 7
syncing dataquest-dev/dtq-dev #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Added spacing between clarin & dspace logo * Spacing is prettier and when clarin logo is removed, dspace logo is centered
* Fix of unwanted Seznam Dataset License request while using other licenses
…ace#851) * Added an URL serializer to fix encoding of the special characters from the URL e.g., `[`, `(` because the filename wasn't properly parsed * Added some unit tests for encoding the bitstream filename url
* Do not use the clarin item view box for the bulk access * Removed unused import * Clarify the `showClarinViewBox` is boolean * Use the constant for the hardcoded bulk access list id
* Create Acknowledgment-ReadMe.md Acnkowledgment of NRP project (cherry picked from commit ad889b2) * Video files previews This uses the thumbnail as poster (if available) and correctly sets the source of the video currently only works for anonymously accessible files. (cherry picked from commit 4832c2f) * Handle video previews for restricted items append a shortlived token at the right time (error, seeking, stalled) (cherry picked from commit 2c12d7d) * Display only ORIGINAL bitstreams Thumbnails, when available, should be shown istead of the generic MIME_TYPE_IMAGE. Content of the TEXT bundle should not be shown at all this is usually automatically extracted "text layer" of a PDF, useful for indexing, but don't want people downloading it. (cherry picked from commit 3862442) * fix linter and test errors (cherry picked from commit f852096) * Code review follow up the listOfFiles should really not contain files from "TEXT" or "THUMBNAIL" bundles. * code review unsubscribe error$, seeking$ and $stalled * code review - thumbnail might be undefined * code review - consistent formatting
this contains the changes from lindat-2025.05.15113644568
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Sync various updates from lindat-2025.05 including weekly import, filename‐encoding improvements, bulk access tweaks, Solr config copy changes, and UI/logo layout adjustments.
- Add a custom
UrlSerializerandencodeRFC3986URIComponentutil to properly handle special characters in bitstream URLs - Introduce bulk‐access list ID and clarify view logic for CLARIN vs default list display
- Update Docker Compose to copy Solr configs on each start and schedule a weekly import
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/app/shared/object-list/object-list.component.ts | Define BULK_ACCESS_LIST_ID and showClarinViewBox() for bulk‐access logic |
| src/app/shared/object-list/object-list.component.html | Swap hardcoded context === 'search' checks for showClarinViewBox() |
| src/app/shared/clarin-shared-util.ts | Add encodeRFC3986URIComponent to safely encode filenames |
| src/app/login-page/login-page.component.html | Align logos, add target="_blank" and rel attrs |
| src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.ts | Implement OnDestroy to manage subscriptions and handle missing thumbnails |
| src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.spec.ts | Minor spec formatting update |
| src/app/item-page/clarin-files-section/clarin-files-section.component.ts | Restrict metadata bitstream query to ORIGINAL only |
| src/app/core/url-serializer/bitstream-url-serializer.ts | New BitstreamUrlSerializer to encode filenames in router URLs |
| src/app/core/url-serializer/bitstream-url-serializer.spec.ts | Tests for filename encoding in /bitstream/ URLs |
| src/app/bitstream-page/legacy-bitstream-url.resolver.ts | Apply encodeRFC3986URIComponent to legacy download filenames |
| src/app/bitstream-page/clarin-zip-download-page/clarin-zip-download-page.component.ts | Conditionally append dtoken query param only if present |
| src/app/bitstream-page/clarin-license-agreement-page/clarin-license-agreement-page.component.ts | Subscribe to item$ and load license content only when appropriate |
| src/app/app.module.ts | Provide custom BitstreamUrlSerializer for the router |
| docker/docker-compose-rest.yml | Remove Solr volume mount; copy configsets at startup |
| .github/workflows/import-weekly.yml | Schedule weekly import job every Sunday |
| .github/disabled-workflows/deploy.yml | Remove legacy import schedule from disabled workflow |
Comments suppressed due to low confidence (4)
src/app/item-page/simple/field-components/preview-section/file-description/file-description.component.ts:38
- [nitpick] Property
handlers_addeduses snake_case; align with TypeScript style by renaming tohandlersAdded.
handlers_added = false;
src/app/shared/clarin-shared-util.ts:96
- This new utility lacks unit tests. Add tests covering spaces, brackets, parentheses, and edge cases to ensure correct encoding.
export function encodeRFC3986URIComponent(uriPart: string) {
src/app/core/url-serializer/bitstream-url-serializer.spec.ts:26
- Consider adding a test for filenames containing multiple
/segments to verify path reconstruction (and catch unintended comma insertion).
it('should encode special characters in the filename in /bitstream/ URLs', () => {
src/app/shared/clarin-shared-util.ts:101
- The implementation only replaces parentheses. To match the docstring and cover brackets (
[]) and spaces, extend the replacement or use a more comprehensive regex/pipeline.
.replace(/[()]/g, c => '%' + c.charCodeAt(0).toString(16).toUpperCase());
| loadLicenseContentSeznam() { | ||
| this.htmlContentService.getHmtlContentByPathAndLocale(this.LICENSE_PATH_SEZNAM_CZ).then(content => { | ||
| this.licenseContentSeznam.next(content); | ||
| this.item$.subscribe((item) => { |
Copilot
AI
May 20, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subscribing directly without cleanup can cause memory leaks. Consider using take(1) or unsubscribing in ngOnDestroy, or refactor to async pipe.
| this.item$.subscribe((item) => { | |
| this.item$.pipe(take(1)).subscribe((item) => { |
This contains the changes from lindat-2025.05.15113644568
Run Python import weekly
ZCU-PUB/Fixed encoding of the filename from the URL (dataquest-dev#838) (dataquest-dev#851)
UFAL/Do not mount the Solr configs; copy them each time instead. (dataquest-dev#850)
UFAL/Fix the bulk access (dataquest-dev#852)
UFAL/Do not add
dtokeninto the URL if it is null (dataquest-dev#860) (dataquest-dev#861)UFAL/Added spacing between clarin & dspace logo (dataquest-dev#848)
Ufal/seznam license request (dataquest-dev#844)
Merge remote-tracking branch 'dataquest-dev/dtq-dev' into HEAD
Ufal dtq sync 2025 05 14 (dataquest-dev#855)