Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

formatting question - KeepLatestVersionNFilesInFolder #83

Open
t-readyroc opened this issue Oct 19, 2022 · 14 comments · May be fixed by #84
Open

formatting question - KeepLatestVersionNFilesInFolder #83

t-readyroc opened this issue Oct 19, 2022 · 14 comments · May be fixed by #84
Labels
bug Something isn't working help wanted Extra attention is needed question Further information is requested

Comments

@t-readyroc
Copy link

I'm wondering if I may not be able to use this tool due to our versioning scheme. Each repo has a "Packages" subfolder that I'm including, but then also has 30+ different file names in each. I've written the equivalent number of rules for each repo, e.g.:

        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-gui-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-influxdb-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-ingest-([\d]+\.[\d]+\-[\d]+).*\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2

etc & so forth, to cover each of the different file names. However, when I attempt to run the cleanup, the script gives me the error:

ValueError: invalid literal for int() with base 10: '6-4'

I have tested each individual regex, to make sure that they do match the version numbers for each file name. I'm pretty sure I'm just not formatting it correctly for what the script expects. I'm guessing that maybe the rule separates the digits by dots? Which won't work for us, since our last digit is separated by a dash?

Any guidance would be appreciated.

@allburov
Copy link
Member

Hi!

Have you tried to escape \ with \\?
The example from the readme.md shows it as:

- rule: KeepLatestVersionNFilesInFolder
  count: 1
  custom_regexp: "[^\\d][\\._]((\\d+\\.)+\\d+)"

Could you give few examples for the filenames? I need to reproduce it on my side, don't get the filename format from the regexp

@allburov
Copy link
Member

Do you use RPM repository or it's just usual file one with rpm files in it?
Asking because Artifactory set some additional properties to the files in RPM repository and it can help us in the future research if above solution doesn't work.

@allburov allburov added the question Further information is requested label Oct 20, 2022
@t-readyroc
Copy link
Author

Sorry, yes, I ran into the double escape issue, & have since updated the regexp:

        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-gui-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-influxdb-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"

The repositories are "package type: RPM," & examples of the files above are:

bootstrap-gui-3.10-33.src.rpm
bootstrap-ingest-3.10-92.src.rpm
bootstrap-influxdb-3.7-3.noarch.rpm

As you can see, the revision is delineated by a -, which is what, I think is causing the issue. I found yesterday where it looks like the rule is splitting on .

allburov added a commit that referenced this issue Oct 21, 2022
@allburov allburov linked a pull request Oct 21, 2022 that will close this issue
@allburov
Copy link
Member

allburov commented Oct 21, 2022

@t-readyroc thank you for pointing out the exact line!

I've pushed the fix for it here #84
Could you install from the branch and test that it works for your case? If it does - I'll release changes

pip uninstall artifactory-cleanup
pip install git+https://github.com/devopshq/artifactory-cleanup.git@keep-latest-n-files-no-dot
artifactory-cleanup --version
# Must show 1.0.4

Important note - above rules don't work in that way

If you use "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm" regexp - the rule will keep all artifacts that doesn't match the rule (just to avoid accidental removes for new artifacts)

In order to achieve what you want - you have to split it to different policies

  policies:
    - name: reponame - bootstrap gui
      rules:
        - rule: Repo
          name: "reponame"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-gui-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"
    - name: reponame - bootstrap ingest
      rules:
        - rule: Repo
          name: "reponame"
        - rule: KeepLatestVersionNFilesInFolder
          count: 2
          custom_regexp: "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"

@allburov allburov added the bug Something isn't working label Oct 21, 2022
@t-readyroc
Copy link
Author

Hey @allburov - running into an odd thing. Getting the following error:

Traceback (most recent call last):
  File "/home/myuser/.pyenv/versions/ans2.10/bin/artifactory-cleanup", line 33, in <module>
    sys.exit(load_entry_point('artifactory-cleanup==1.0.4', 'console_scripts', 'artifactory-cleanup')())
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/plumbum/cli/application.py", line 177, in __new__
    return cls.run()
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/plumbum/cli/application.py", line 634, in run
    retcode = inst.main(*tailargs)
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/artifactory_cleanup/cli.py", line 121, in main
    for summary in cleanup.cleanup(
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/artifactory_cleanup/artifactorycleanup.py", line 53, in cleanup
    artifacts_to_remove = policy.filter(artifacts)
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/artifactory_cleanup/rules/base.py", line 284, in filter
    artifacts = rule.filter(artifacts)
  File "/home/myuser/.pyenv/versions/3.8.13/envs/ans2.10/lib/python3.8/site-packages/artifactory_cleanup/rules/keep.py", line 176, in filter
    artifactory_with_version.sort()
TypeError: '<' not supported between instances of 'dict' and 'dict'

Running 1.0.4:

artifactory-cleanup --version
artifactory-cleanup 1.0.4

Also: thanks for the heads-up on the rules formatting. I'm now formatting them as separate policies, like so:

- name: bootstrap-repo-* - Keep 2 most recent versions - bootstrap-gui
  rules:
    - rule: RepoByMask
      mask: "bootstrap-repo-*"
    - rule: IncludePath
      masks: "Packages"
    - rule: KeepLatestVersionNFilesInFolder
      count: 2
      custom_regexp: "bootstrap-gui-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"
- name: bootstrap-repo-* - Keep 2 most recent versions - bootstrap-ingest
  rules:
    - rule: RepoByMask
      mask: "bootstrap-repo-*"
    - rule: IncludePath
      masks: "Packages"
    - rule: KeepLatestVersionNFilesInFolder
      count: 2
      custom_regexp: "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+).*\\.rpm"

@allburov
Copy link
Member

allburov commented Oct 27, 2022

@t-readyroc I've pushed new version to the same branch
If it doesn't work - could you "print" (or just set a breakpoint on that line) the artifactory_with_version content and send it back please?

Make sure to uninstall it before install the updated version

pip uninstall artifactory-cleanup
pip install git+https://github.com/devopshq/artifactory-cleanup.git@keep-latest-n-files-no-dot
artifactory-cleanup --version
# Must show 1.0.4

@t-readyroc
Copy link
Author

@allburov - looks like your last update fixed the error! I think what's currently in your branch is going to work for us 😄

@t-readyroc
Copy link
Author

I may have spoken too soon - but I think we may be back to a question of rule formatting/use. At the moment, it looks like The following rules are deleting all artifacts from the specified repos. Have I messed up the rule formatting again?

A pair of example rules (one for .noarch.rpm, one for .src.rpm):

- name: REPO-* - Keep 2 most recent versions - bootstrap-ingest.noarch
  rules:                                                                    
    - rule: RepoByMask                                                      
      mask: "REPO-*"                                                 
    - rule: IncludePath                                                     
      masks: "Packages"                                                     
    - rule: KeepLatestVersionNFilesInFolder                                 
      count: 2                                                              
      custom_regexp: "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+)\\.noarch\\.rpm"
- name: REPO-* - Keep 2 most recent versions - bootstrap-ingest.src  
  rules:                                                                    
    - rule: RepoByMask                                                      
      mask: "REPO-*"                                                 
    - rule: IncludePath                                                     
      masks: "Packages"                                                     
    - rule: KeepLatestVersionNFilesInFolder                                 
      count: 2                                                              
      custom_regexp: "bootstrap-ingest-([\\d]+\\.[\\d]+\\-[\\d]+)\\.src\\.rpm"

I've just run a test output (no -destroy) to check the files to be deleted, and I've gone through 2 of the 8 release repos. It's looking like the above rules will delete every artifact in the repo - leaving none.

@t-readyroc
Copy link
Author

Example output from the application on one of the above rules. 3.10-90 & 3.10-92 are the most recent 2 versions:

DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-83.noarch.rpm' - 309M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-84.noarch.rpm' - 310M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-85.noarch.rpm' - 310M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-86.noarch.rpm' - 311M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-87.noarch.rpm' - 311M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-10.noarch.rpm' - 181M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-11.noarch.rpm' - 182M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-1.noarch.rpm' - 182M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-20.noarch.rpm' - 191M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-21.noarch.rpm' - 199M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-22.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-24.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-25.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-26.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-28.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-29.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-2.noarch.rpm' - 182M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-30.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-31.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-32.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-33.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-34.noarch.rpm' - 262M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-35.noarch.rpm' - 262M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-36.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-39.noarch.rpm' - 272M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-40.noarch.rpm' - 272M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-41.noarch.rpm' - 272M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-42.noarch.rpm' - 273M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-48.noarch.rpm' - 280M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-49.noarch.rpm' - 275M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-50.noarch.rpm' - 275M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-7.noarch.rpm' - 181M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.8-9.noarch.rpm' - 180M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-168.noarch.rpm' - 304M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-169.noarch.rpm' - 309M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-170.noarch.rpm' - 310M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-171.noarch.rpm' - 308M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-178.noarch.rpm' - 314M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-179.noarch.rpm' - 313M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.9-1.noarch.rpm' - 200M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-90.noarch.rpm' - 310M 
DEBUG - we would delete 'REPO-RELEASE-3.10/Packages/bootstrap-ingest-3.10-92.noarch.rpm' - 311M 

@t-readyroc t-readyroc changed the title formattign question - KeepLatestVersionNFilesInFolder formatting question - KeepLatestVersionNFilesInFolder Oct 28, 2022
@allburov allburov added the help wanted Extra attention is needed label Nov 16, 2022
@t-readyroc
Copy link
Author

Any ideas on this?

@Keonik1
Copy link

Keonik1 commented Jan 17, 2023

@allburov have you considered comparing versions via packaging? In my other project, he worked out correctly with a dash ("-"), but I just don't have enough experience to change something correctly here)

https://stackoverflow.com/questions/11887762/how-do-i-compare-version-numbers-in-python

https://packaging.pypa.io/en/stable/version.html

@allburov
Copy link
Member

have you considered comparing versions via packaging?

Actually, the version pattern depends on repository type a lot - debian's repository can have a bit different version pattern than python one, for instance.
It's the reason why we added regexp.

@Keonik1
Copy link

Keonik1 commented Jan 17, 2023

@allburov Can you explain why such a regular expression does not work in the project?
/test123\/[\d]+.[\d]+\/test123-([\d]+\.[\d]+\.[\d]+-[\d]+)\..+/ (/<regular expression>/ is the same as \\<regular expression>, so the error should not be in this.)
reference: https://regex101.com/r/VuKsoN/1

Gives out such an error? I tried both in the release and in the branch from this issue the same way

Warning: Could not identify version for test123/2.0/test123-2.0.10-854.txt
Filter package test123/2.0/test123-2.0.10-854.txt
Warning: Could not identify version for test123/2.0/test123-2.0.10-854.bin
Filter package test123/2.0/test123-2.0.10-854.bin
Warning: Could not identify version for test123/2.0/test123-2.0.10-863.txt
Filter package test123/2.0/test123-2.0.10-863.txt
Warning: Could not identify version for test123/2.0/test123-2.0.10-863.bin
Filter package test123/2.0/test123-2.0.10-863.bin
Warning: Could not identify version for test123/2.1/test123-2.1.1-944.txt
Filter package test123/2.1/test123-2.1.1-944.txt
Warning: Could not identify version for test123/2.1/test123-2.1.1-944.bin
Filter package test123/2.1/test123-2.1.1-944.bin
Warning: Could not identify version for test123/2.1/test123-2.1.1-945.txt
Filter package test123/2.1/test123-2.1.1-945.txt
Warning: Could not identify version for test123/2.1/test123-2.1.1-945.bin
Filter package test123/2.1/test123-2.1.1-945.bin

UPD:
I have tried different expressions:
/test123\/[\d]+.[\d]+\/test123-([\d]+\.[\d]+\.[\d]+-[\d]+)\..+/
'test123\/[\d]+.[\d]+\/test123-([\d]+\.[\d]+\.[\d]+-[\d]+)\..+'
"test123\\/[\\d]+.[\\d]+\\/test123-([\\d]+\\.[\\d]+\\.[\\d]+-[\\d]+)\\..+"

@arossert
Copy link

arossert commented Apr 2, 2024

I have a similar issue, my version pattern looks like 24.1.3-7698, so the - part of the version cannot be handled as it is splitted by . and the 3-7698 cannot be parsed as int.

Is there a planned fix for this or should I create a custom rule?

BTW, replacing the

artifactory_with_version.sort(
    key=lambda x: [int(x) for x in x[0].split(".")]
)

With

artifactory_with_version.sort(
    key=lambda x: packaging.version.Version(x[0])
)

Should do the trick, I don't mind creating a PR if you are intrested

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working help wanted Extra attention is needed question Further information is requested
Development

Successfully merging a pull request may close this issue.

4 participants