CI: Disable stack cost retrieval to stop job hangs

This is a temporary workaround for critical job hangs caused by the
lftools openstack stack cost command, which lacks timeout handling.

Problem (IT-28614):
The lftools stack cost command hangs indefinitely when:
- Pricing API (pricing.vexxhost.net) is slow/unresponsive
- OpenStack SDK queries for nested stacks take too long
- Network operations lack timeout parameters

This causes:
- Jobs stuck at 'Retrieving stack cost for: <stack-name>'
- Downstream jobs waiting on checkpoints indefinitely
- Jenkins queue buildup (66+ waiting jobs reported)
- Manual intervention needed to cancel stuck jobs

Solution:
Disable stack cost retrieval by commenting out the lftools call
and hardcoding the stack cost to 0 in the stack-cost file.

Known Regression:
Stack costs will be reported as 0 in archived cost.csv files, so
stack-level cost tracking data for OpenStack resources is lost while
this workaround is in place. Instance-level costs are still collected
via job-cost.sh pricing API calls.

Root Cause:
The cost() function in lftools/openstack/stack.py lacks:
- timeout parameter for urllib.request.urlopen() (line 87)
- timeout handling for OpenStack SDK resource enumeration
- graceful degradation on network failures

Next Steps:
1. Implement timeout handling in lftools
2. Once lftools is fixed, revert this workaround
3. Create a tracking issue for the regression (stack costs = 0)
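
A minimal sketch of what step 1 might look like for the pricing request,
assuming Python's standard urllib.request.urlopen(). The helper name
fetch_pricing_json, its opener parameter, and the 30-second default are
hypothetical and are not taken from lftools. Note that urlopen()'s
timeout bounds each blocking socket operation, not the total request
time.

```python
import http.client
import json
import logging
import urllib.request

log = logging.getLogger(__name__)

# Hypothetical default, not taken from lftools; tune for the pricing API.
DEFAULT_TIMEOUT_SECONDS = 30.0


def fetch_pricing_json(url, timeout=DEFAULT_TIMEOUT_SECONDS,
                       opener=urllib.request.urlopen):
    """Return parsed JSON from url, or None on timeout or network failure.

    opener is injectable so the function can be exercised without a network.
    """
    try:
        # timeout bounds each blocking socket operation (connect, read).
        with opener(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, http.client.HTTPException, ValueError) as exc:
        # OSError covers urllib.error.URLError and socket timeouts;
        # ValueError covers malformed JSON and undecodable bytes.
        log.warning("Pricing lookup failed for %s: %s", url, exc)
        return None
```

Returning None would let the caller skip or zero the stack cost instead
of blocking the job. The OpenStack SDK enumeration would need its own
timeout handling, which this sketch does not cover.
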
This stopgap prioritizes job reliability over cost data collection.

Issue-ID: IT-28614
Change-Id: I9d054347e485f4843c7287aec49fb7aff0962e96
Signed-off-by: Anil Belur <abelur@linuxfoundation.org>