FinOps Program

Audience: Cloud Engineering, Platform Engineering, and IT Finance

Purpose: Centralized resources for cloud cost management and optimization


| Resource | Description |
| --- | --- |
| Cloudability | FinOps platform for cost visibility |
| FinOps Foundation | Industry framework and best practices |
| AWS Cost Management | Native AWS cost tools |
| Azure Cost Management | Native Azure cost tools |

What is FinOps?

FinOps (a portmanteau of "Finance" and "DevOps") is an operational framework and cultural practice that brings financial accountability to cloud spending. It combines systems, best practices, and collaboration between engineering, finance, and business teams so that an organization can understand its cloud costs and make data-driven decisions about them.


Program Resources

Orientation Materials

Getting Started

New team members should review the FinOps orientation materials to understand cloud cost management principles and TAMU-specific practices.

Orientation Topics
  1. FinOps Framework Overview

    • The three phases: Inform, Optimize, Operate
    • Key personas and responsibilities
    • Maturity model and progression
  2. TAMU Cloud Environment

    • Multi-cloud strategy (AWS, Azure, GCP)
    • Account structure and governance
    • Tagging standards for cost allocation
  3. Tooling and Reporting

    • Cloudability dashboard navigation
    • Native cloud cost tools
    • Custom reports and alerts
  4. Optimization Strategies

    • Reserved instances and savings plans
    • Right-sizing recommendations
    • Waste elimination tactics

Standard Operating Procedures

| SOP | Description |
| --- | --- |
| Monthly Cost Review | Process for reviewing and analyzing monthly cloud costs |
| Reserved Instance Management | Purchasing and managing RIs across platforms |
| Cost Anomaly Investigation | Steps for investigating unexpected cost spikes |
| Chargeback Reporting | Generating cost allocation reports for units |

Cost Visibility

Tagging Standards

Proper tagging is essential for accurate cost allocation. All cloud resources should include the following tags:

| Tag Key | Description | Example |
| --- | --- | --- |
| CostCenter | Financial cost center code | 12345 |
| Environment | Deployment environment | production, development, test |
| Owner | Team or individual responsible | cloud-engineering |
| Project | Project or application name | student-portal |
| DataClassification | Data sensitivity level | public, confidential |
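
A tag check like the following could run against exported resource inventories to find resources that would break cost allocation. This is a minimal sketch, not an existing TAMU tool: the function name is hypothetical, and the allowed value sets are assumptions built only from the example values in the table, which may not be the complete lists.

```python
# Required tag keys, taken from the tagging standards table.
REQUIRED_TAGS = ("CostCenter", "Environment", "Owner", "Project", "DataClassification")

# Assumption: only the example values shown in the table are treated as valid.
# The real standard may allow more values than these.
ALLOWED_VALUES = {
    "Environment": {"production", "development", "test"},
    "DataClassification": {"public", "confidential"},
}


def find_tag_problems(tags: dict[str, str]) -> list[str]:
    """Return human-readable problems found in one resource's tags."""
    problems = []
    for key in REQUIRED_TAGS:
        value = tags.get(key, "").strip()
        if not value:
            problems.append(f"missing tag: {key}")
        elif key in ALLOWED_VALUES and value not in ALLOWED_VALUES[key]:
            problems.append(f"unexpected value for {key}: {value!r}")
    return problems


if __name__ == "__main__":
    # A resource with an abbreviated environment name and no other tags.
    for problem in find_tag_problems({"Environment": "prod"}):
        print(problem)
```

An empty result means the resource carries all five required tags with recognized values.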

Reporting Cadence

| Report | Frequency | Audience |
| --- | --- | --- |
| Daily Spend Summary | Daily | Cloud Engineering |
| Weekly Optimization Report | Weekly | Platform Engineering |
| Monthly Cost Analysis | Monthly | IT Leadership |
| Quarterly Business Review | Quarterly | Executive Leadership |

Optimization Strategies

Reserved Instances & Savings Plans

Reserved capacity purchases provide significant discounts (up to 72% off on-demand pricing, depending on term length and payment option) for predictable workloads:

AWS

  • Reserved Instances for EC2, RDS, ElastiCache
  • Savings Plans for compute flexibility
  • Analyze usage patterns before purchasing

Azure

  • Reserved VM Instances
  • Reserved capacity for SQL, Cosmos DB
  • Azure Hybrid Benefit for Windows/SQL licensing

GCP

  • Committed Use Discounts (CUDs)
  • Sustained use discounts (automatic)
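
Analyzing usage before purchasing matters because a commitment is billed whether or not the capacity is used. The sketch below illustrates the break-even arithmetic under a deliberately simplified model: the commitment is assumed to bill every hour of the term at a flat discounted rate, and upfront payment options, instance size flexibility, and resale are ignored. The function names and the example rate are hypothetical.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of the term a resource must run for the commitment to match on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1 - discount


def commitment_savings(on_demand_hourly: float, discount: float,
                       hours_in_term: float, hours_used: float) -> float:
    """Savings (positive) or loss (negative) versus paying on-demand for hours_used."""
    # The commitment is paid for every hour of the term, used or not.
    committed_cost = on_demand_hourly * (1 - discount) * hours_in_term
    # Without a commitment, only the hours actually used are billed.
    on_demand_cost = on_demand_hourly * hours_used
    return on_demand_cost - committed_cost


if __name__ == "__main__":
    hours_per_year = 8760
    # Hypothetical instance: $0.10/hour on-demand with a 40% commitment discount.
    print(break_even_utilization(0.40))
    print(commitment_savings(0.10, 0.40, hours_per_year, hours_per_year))
    print(commitment_savings(0.10, 0.40, hours_per_year, hours_per_year / 2))
```

Under this model, a 40% discount pays off only if the resource runs more than 60% of the term; at 50% utilization the commitment costs more than on-demand would have.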

Right-Sizing Recommendations

Regularly review compute resources to ensure appropriate sizing:

  1. Identify Underutilized Resources

    • CPU utilization < 20% average
    • Memory utilization < 40% average
    • Network throughput minimal
  2. Evaluate Right-Sizing Options

    • Downsize to smaller instance types
    • Consider burstable instances (for example, AWS T-series or Azure B-series)
    • Consolidate workloads where appropriate
  3. Implement Changes

    • Schedule changes during maintenance windows
    • Monitor performance after changes
    • Document decisions and outcomes
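
The identification step above can be sketched as a simple filter over utilization averages. This is an illustration under stated assumptions: the data class and function names are hypothetical, an instance is flagged only when both the CPU and memory criteria hold (the list does not say whether they combine with "and" or "or"), and network throughput is left out because no numeric threshold is given for it.

```python
from dataclasses import dataclass

CPU_THRESHOLD = 20.0     # percent average CPU, from the criteria above
MEMORY_THRESHOLD = 40.0  # percent average memory, from the criteria above


@dataclass
class InstanceUtilization:
    instance_id: str
    avg_cpu_percent: float
    avg_memory_percent: float


def rightsizing_candidates(instances: list[InstanceUtilization]) -> list[str]:
    """Return IDs of instances whose average CPU and memory are both below threshold."""
    return [
        inst.instance_id
        for inst in instances
        if inst.avg_cpu_percent < CPU_THRESHOLD
        and inst.avg_memory_percent < MEMORY_THRESHOLD
    ]


if __name__ == "__main__":
    fleet = [
        InstanceUtilization("web-1", avg_cpu_percent=12.0, avg_memory_percent=35.0),
        InstanceUtilization("db-1", avg_cpu_percent=15.0, avg_memory_percent=70.0),
        InstanceUtilization("batch-1", avg_cpu_percent=65.0, avg_memory_percent=30.0),
    ]
    print(rightsizing_candidates(fleet))  # ['web-1']
```

Flagged instances are candidates for review, not automatic downsizing; averages can hide short peaks that the workload still needs headroom for.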

Waste Elimination

Common sources of cloud waste:

| Waste Type | Detection | Resolution |
| --- | --- | --- |
| Orphaned volumes | Unattached EBS/managed disks | Delete or snapshot and remove |
| Idle load balancers | No registered targets | Remove or consolidate |
| Stale snapshots | Old backups beyond retention | Delete per retention policy |
| Unused IPs | Unattached elastic IPs | Release back to pool |
| Dev/test running 24/7 | Non-production always on | Implement schedules |
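
To give a sense of scale for the last row, the arithmetic below estimates how many compute hours a schedule avoids compared with running around the clock. It is a rough sketch: the function name is hypothetical, and it assumes cost scales with running hours, which ignores charges such as attached storage that continue while an instance is stopped.

```python
HOURS_PER_WEEK = 24 * 7  # 168 hours in an always-on week


def schedule_savings_fraction(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of always-on compute hours avoided by running only on a schedule."""
    scheduled_hours = hours_per_day * days_per_week
    if not 0 <= scheduled_hours <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within one week")
    return 1 - scheduled_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # Example schedule: 12 hours a day, Monday through Friday (60 of 168 hours).
    print(f"{schedule_savings_fraction(12, 5):.0%}")
```

A 12-hour weekday schedule runs 60 of 168 weekly hours, so it avoids roughly 64% of the compute hours of an always-on environment.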

Governance & Policies

Budget Alerts

Configure budget alerts to monitor spending:

  • 50% threshold — Informational notification
  • 80% threshold — Warning to stakeholders
  • 100% threshold — Alert to leadership and action required
  • Forecasted 100% — Proactive warning based on trend
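
The four thresholds above can be expressed as a small evaluation function. This is a sketch of the logic only, not the configuration format of any budgeting tool; the function name and the alert labels are hypothetical, and the forecast alert is assumed to be useful only while actual spend is still under budget.

```python
def budget_alerts(budget: float, actual: float, forecast: float) -> list[str]:
    """Return the alert levels triggered by actual and forecasted spend."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = actual / budget
    alerts = []
    if used >= 0.5:
        alerts.append("informational")    # 50% threshold
    if used >= 0.8:
        alerts.append("warning")          # 80% threshold
    if used >= 1.0:
        alerts.append("action-required")  # 100% threshold
    # Assumption: the forecast alert is redundant once the budget is already spent.
    if used < 1.0 and forecast / budget >= 1.0:
        alerts.append("forecast-warning")
    return alerts


if __name__ == "__main__":
    # 85% of a 1,000 budget spent, with month-end spend forecast at 1,100.
    print(budget_alerts(budget=1000, actual=850, forecast=1100))
    # ['informational', 'warning', 'forecast-warning']
```

Returning every triggered level, rather than only the highest, lets each audience subscribe to the level meant for it.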

Cost Anomaly Detection

Automated anomaly detection helps identify unexpected spending:

  1. Enable native anomaly detection (AWS, Azure)
  2. Configure Cloudability alerts for custom thresholds
  3. Establish investigation workflow for anomalies
  4. Document root cause and remediation
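
For intuition about what an anomaly alert does, the sketch below flags days whose spend is far from the recent average. It uses a simple trailing-window z-score, which is a generic heuristic chosen for illustration and is not how the AWS, Azure, or Cloudability detectors work. The function name, window length, and threshold are all assumptions.

```python
from statistics import mean, stdev


def find_spend_anomalies(daily_costs: list[float], window: int = 7,
                         z_threshold: float = 3.0) -> list[int]:
    """Return indexes of days whose cost deviates strongly from the trailing window."""
    if window < 2:
        raise ValueError("window must be at least 2 to compute a standard deviation")
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: treat any change at all as anomalous.
            if daily_costs[i] != mu:
                anomalies.append(i)
            continue
        if abs(daily_costs[i] - mu) / sigma > z_threshold:
            anomalies.append(i)
    return anomalies


if __name__ == "__main__":
    costs = [100, 102, 98, 101, 99, 100, 103, 250, 101]
    print(find_spend_anomalies(costs))  # [7]
```

Note that the spike on day 7 inflates the window used for day 8, so a heuristic like this can miss follow-on anomalies; that limitation is one reason to rely on the native detectors for production alerting and keep the investigation workflow for what they surface.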