Disk quota warnings page missing some info

Hi All,

It looks like the “Individual User Quota” section of " Disk Quota Warnings on Dashboard" here: https://osc.github.io/ood-documentation/master/customization.html#individual-user-quota is missing the example for what a user json would look like.

It states:

If the quota is defined as a user quota, then it applies to only disk resources used by the user alone. This is the default type of quota object and is given in the following format:

but there is no example after, just the next section " Individual Fileset Quota"

Could this be updated?

Thanks!
Morgan

Yes it can! I’ve opened this ticket on github with an example for you in that ticket.

1 Like

We just had a question on this on our today’s local OOD call. I did not realize that OOD is capable of this.

Jeff, how do you go with generating those disk use json files? Do you run a cron job that stores them at a given location for each user? Or do you generate then on demand? Also, how do you get the usage and quota info?

To put you to our perspective - we can’t use the quota command since our quota are done on the file server (our storage guy could tell you what I don’t know), but, we have a script that our webadmin wrote and runs periodically that does some low level file system query that gives us the actual usage per user, and I can then query a database that stores this to pull the usage for a given user.

So, I’d be looking at writing a script that pulls the usage data, and then formats them into the json format that OOD wants. I guess I’d be running this as a cron job too but I am not sure if I want to do this for all our 5000 users every X minutes while only a handful will be on the OOD, or if I could query what users are currently logged in and do it only for them.

Thanks,
MC

@mcuma At OSC we collect quota usage from both GPFS hosting project and scratch filesystems and our NetApp which hosts home directories. We run a cron job on one of our servers with access to GPFS that writes the JSON files to a shared NFS mount that is then mounted via autofs on all our OnDemand nodes. We use the same scripts to write out JSON files capable of being loaded into XDMoD’s storage realm.

We query the NetApp every 10 minutes via ONTAP API and we query GPFS I think every 15 minutes via mmrepquota. Both operations are low impact and fast. Sometimes the NetApp queries take a while as we have thousands of users but it’s rare one cron run will get locked out by a lock file being locked from previous execution.

The way the JSON files work for OnDemand it makes sense to collect quota information for everyone unless you limit access to OnDemand by some means where you could limit the scope of what’s included in the collection and JSON files.

1 Like

Thanks Trey, this sounds doable, I’ll poke our people that know the quota commands and then should be able to get this working.

Hi everyone,

I have an add on question to this.

What’s the difference between block_usage and total_block_usage, and file_usage and total_file_usage?

Thanks,
MC

If you could share your scripts, if they aren’t already, I’d appreciate it. We use GPFS as well and I’d love not having to do that work to provide the quota listing.

Thanks either way.

@mcuma The difference is only when using “type” of “fileset” which we setup for GPFS. At OSC we do a GPFS fileset for projects and set limits on the fileset not on the users but we still can get user usage. The user usage for a given fileset is {block,file}_usage and the usage of the overall fileset is the total_*_usage. For non-fileset quotas where the quota is on the user then total_*_usage will be the usage for that user.

GPFS:

        {
        "block_limit": 0, 
        "block_usage": 6225024, 
        "file_limit": 0, 
        "file_usage": 139, 
        "path": "/fs/project/ibtest", 
        "total_block_usage": 508949920, 
        "total_file_usage": 11287, 
        "type": "fileset", 
        "user": "tdockendorf"
    },

NetApp where the quota is on the user:

        {
        "block_limit": 524288000, 
        "file_limit": 1000000, 
        "path": "/users/sysp", 
        "total_block_usage": 314017124, 
        "total_file_usage": 766592, 
        "user": "tdockendorf"
    },
1 Like

@novosirj The script is here: https://gist.github.com/treydock/a4af28d6928029ac7541a25e16a446f4

The script relies on several libraries from a “osc” python module. The main one is for GPFS here: https://gist.github.com/treydock/8447cf7ea2cc470c0a342bfa51037285

The references to setup_logging and exec_cmd are just helpers around logging module and subprocess. The LDAP queries in the script can be ignored if you don’t need to generate JSON for XDMOD’s storage realm. I’m working to get approvals from folks here to publish our entire Python module which would make understanding the script a bit easier.

The cron jobs are based on symlinks of the script that defines the GPFS filesystem:

/users/reporting/storage/sbin/repquota_gpfs.project.py -t fileset user --cron --histdir /users/reporting/storage/quota/historical/gpfs.project --quotadir /users/reporting/storage/quota --quotafile gpfs.project_quota.txt 2>&1 | logger -t repquota_gpfs.project

Hi Trey,

great, thanks. That clarifies things. It would be great to update the OOD docs this way too.

One tangential question. Our users may have space on different file systems. Everyone has a home, either quoted or not, but, we also have a lot of what we call “group spaces”, file systems that research groups purchase and only their members can write to. They are not quoted, but, they often fill up so it’d be nice to display warnings when they are close to full as well. Can the current setup support more than 1 file system?

Thanks,
MC

I’ve opened issue for the documentation update: https://github.com/OSC/ood-documentation/issues/239

Yes, you can support multiple filesystems. We ingest JSON for 3 unique filesystems at OSC, 2 are GPFS filesystems and one is a NetApp.

Here is part of our /etc/ood/config/apps/dashboard/env

OOD_QUOTA_PATH=/users/reporting/storage/quota/netapp.netapp-home.ten.osc.edu-users_quota.json:/users/reporting/storage/quota/gpfs.project_quota.json:/users/reporting/storage/quota/gpfs.scratch_quota.json

The basic idea is that OOD_QUOTA_PATH is treated like PATH in that multiple values are separated by a colon.

At this time the quota is looked up by username from the JSON so you’d have to populate all users who can access a given group space and just assign them all the overall usage of the filesystem.

I’ve opened https://github.com/OSC/ondemand/issues/412 to possibly address supporting showing quota warnings based on group quotas and group membership.

OK, thanks. What would happen if you’d have multiple file systems info in a single JSON file? That’s what we’re currently working with.

That should work just fine. The lookup is by username, and if you gave each a unique path then the warning message would be meaningful in OnDemand. If multiple quotas exist for a single user, they will be merged to produce multiple warnings if the thresholds are exceeded for any of the defined quotas.

Sounds good, we’ll wrangle some more with the JSON file format over here and report back.
MC

In case it helps, there is a standard format for usage accounting, including storage accounting, used among the XSEDE projects and grid projects. the storage part is based on the earlier = EMI Storage Accounting Record (StAR) used in EGI and as well as the European Open Science Cloud that is suitable for continued adoption here.

https://www.ogf.org/documents/GFD.204.pdf

The advantages of using this are that it is already in use and cloud-ready.

I think we have the JSON file right now but I am still not seeing anything in the dashboard. Can someone (Trey?) please comment on this?

This is what I did:

The pertinent entry for me in the JSON file above is:
{
“total_file_usage”: 13210025,
“file_limit”: 0,
“user”: “u0101881”,
“block_limit”: 4294967296,
“total_block_usage”: 3611816924,
“file_usage”: 4737944,
“path”: “/uufs/chpc.utah.edu/common/home/u0101881”,
“block_usage”: 1266229924,
“type”: “fileset”
},
i.e. there’s ~29% free on this file system, my threshold is 1%. So the quota warning should show up, but I don’t see it anywhere.

Perhaps it may be worth it to add a screenshot with the quota warning in the docs as well so we can see where it is and how it looks like?

Thanks,
MC

That should work as long as you’re logged in as u0101881. For debugging purposes I’d recommend downloading that file and saving to disk on OnDemand and update OOD_QUOTA_PATH to be a local file and see if that changes the behavior.

I’ve attached screenshot of what I see when I log into dev instance of OnDemand with threshold set to 0.05.

Thanks Trey, both the suggestion on local file and the screenshot were very helpful - I found out that I am not able to get the JSON file from that web server. We’re sorting that out now.

That screenshot would be helpful in the docs as well so we can see what to expect, when it works.

One more question - is there a way to change the message on how often are the quotes updated? I think we do it every 15 min. But, in the worst case I’ll ask our guys to do it every 5 min.

Thanks,
MC

The wording for the quota messages are stored in locales for Rails.

Add something like the following to /etc/ood/config/locales/en.yml

en:
 dashboard:
  quota_reload_message: "Reload page to see updated quota. Quotas are updated every 15 minutes."

The default text is here: https://github.com/OSC/ondemand/blob/master/apps/dashboard/config/locales/en.yml#L40

Got it, thanks. The quota info looks nice, hopefully it’ll reduce the number of tickets we get when users are over the quota.