Week 3: Data Visualization II and Problem Solving - Flashcards
Review the key terms, definitions, and concepts from Week 3: Data Visualization II and Problem Solving. Use active recall to strengthen your understanding before exams.
Front
Matplotlib Architecture
Back
Matplotlib separates the plotting API from the rendering backend so users can create figures independent of output format. It uses objects like Figure and Axes to build plots programmatically and supports multiple backends for display or file output. Understanding this architecture helps customize and troubleshoot complex visualizations.
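A minimal sketch of that separation, assuming Matplotlib is installed: one Figure with one Axes is built through the object-oriented API and then rendered to two output formats. The data values and labels are illustrative only.

```python
import io

from matplotlib.figure import Figure

# Build the plot from objects: a Figure that holds a single Axes.
fig = Figure(figsize=(4, 3))
ax = fig.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9], marker="o")
ax.set_xlabel("x")
ax.set_ylabel("x squared")
ax.set_title("One Figure, two output formats")

# The same Figure is handed to different renderers depending on the format.
png_buffer = io.BytesIO()
fig.savefig(png_buffer, format="png")  # raster output
svg_buffer = io.BytesIO()
fig.savefig(svg_buffer, format="svg")  # vector output

png_bytes = png_buffer.getvalue()
svg_text = svg_buffer.getvalue().decode("utf-8")
```

The plotting calls never mention the output format. Only savefig does, which is the practical meaning of keeping the API independent of the backend.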
Front
HoloViz
Back
HoloViz is a set of Python libraries designed to simplify building interactive visualizations and dashboards. It integrates tools like hvPlot, Panel, and Datashader to enable scalable and user-friendly exploration of data. The ecosystem emphasizes easy-to-use APIs and rich interactivity for analysts and developers.
Front
hvPlot
Back
hvPlot provides a high-level plotting API that works directly with pandas, xarray, and similar data structures to produce interactive plots. It offers concise syntax for common chart types while integrating with HoloViz tools for dashboards. hvPlot enables quick exploratory visuals without extensive configuration.
Front
Panel
Back
Panel is a library for building interactive dashboards and apps in Python using simple declarative components. It can combine plots, widgets, and layouts from multiple plotting libraries into sharable web applications. Panel supports reactive workflows and can serve dashboards locally or deploy them to servers.
Front
Datashader
Back
Datashader is a graphics pipeline for rendering large datasets by rasterizing data into pixels rather than drawing each point. It enables visualization of billions of points by aggregating and shading at the display resolution, avoiding overplotting. Datashader is often combined with hvPlot and rasterize options for big-data visualization.
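The aggregate-then-shade idea can be sketched with plain NumPy. This is not Datashader's API, only an illustration of the concept under assumed values: the canvas size, the data range, and the log shading are all choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_points = 1_000_000
x = rng.normal(size=n_points)
y = rng.normal(size=n_points)

# Step 1 (aggregate): count how many points fall into each pixel-sized bin.
width, height = 300, 200
counts, x_edges, y_edges = np.histogram2d(
    x, y, bins=(width, height), range=[[-4, 4], [-4, 4]]
)

# Step 2 (shade): map counts to 0..255 intensities; the log scale keeps
# sparse regions visible next to very dense ones.
shaded = np.log1p(counts)
image = (255 * (shaded / shaded.max())).astype(np.uint8).T  # rows = y, columns = x
```

However many points go in, the result is a fixed-size array of width x height pixels, which is why the approach scales to very large datasets.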
Front
rasterize=True
Back
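The rasterize=True option (for example in hvPlot, where it relies on Datashader) tells the plotting tool to render a complex or dense layer as a fixed-size raster image instead of drawing every element as a vector primitive. This improves performance and reduces output size for large point clouds or dense overlays. The trade-off is that the rendered layer is no longer resolution-independent the way vector graphics are, in exchange for speed and practicality in big-data displays.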
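The raster-versus-vector trade-off can be seen with Matplotlib's analogous keyword, rasterized=True. This is a different library and spelling from the rasterize=True option named on this card, and is used here only as an assumption-light way to show the effect. The point count and figure size are illustrative.

```python
import io

import numpy as np
from matplotlib.figure import Figure

rng = np.random.default_rng(0)
x, y = rng.normal(size=(2, 5000))


def scatter_svg(rasterized: bool) -> str:
    """Render the same scatter plot to SVG text, as vectors or as a raster layer."""
    fig = Figure(figsize=(4, 3), dpi=100)
    ax = fig.subplots()
    ax.scatter(x, y, s=2, rasterized=rasterized)
    buffer = io.BytesIO()
    fig.savefig(buffer, format="svg")
    return buffer.getvalue().decode("utf-8")


vector_svg = scatter_svg(rasterized=False)  # every marker stays a vector element
raster_svg = scatter_svg(rasterized=True)   # markers are embedded as one bitmap
```

In the rasterized version the scatter layer becomes an embedded image inside the SVG, while axes and text remain vectors. Whether the file ends up smaller depends on the number of points and the chosen dpi.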
Front
Seaborn
Back
Seaborn is a statistical data visualization library built on top of Matplotlib, offering high-level functions for attractive default styles and common plot types. It simplifies complex visualizations like violin plots, pairplots, and heatmaps with concise APIs. Seaborn is useful for exploratory data analysis and communicating statistical patterns.
Front
Interactive Dashboard
Back
An interactive dashboard combines plots, widgets, and controls to let users explore data dynamically and filter views in real time. Dashboards support decision-making by exposing key metrics and enabling ad hoc analysis without code. Good dashboard design balances interactivity, performance, and clarity for stakeholders.
Front
Streamz
Back
Streamz is a Python library for building pipelines to process streaming data in real time and integrate with visualization stacks. It can feed continuously updating data into dashboards, enabling live monitoring and reactive analytics. Streamz supports connectors to sources like Kafka and integrates with HoloViz tools for streaming visuals.
Front
CRISP-DM
Back
CRISP-DM (Cross-Industry Standard Process for Data Mining) is a process model that organizes a project into six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. It emphasizes iterative development and aligning analytics with business goals. CRISP-DM helps structure projects and communicate progress with stakeholders.
Front
Unstructured Problems
Back
Unstructured problems lack clear definitions, metrics, or prescribed solutions and are common in real-world business contexts. Solving them requires framing the problem, exploring data context, and iterating on possible analyses and actions. Practitioners must decompose ambiguity into actionable sub-problems and prioritize based on impact and feasibility.
Front
Clarify Step
Back
The Clarify step focuses on defining the business goal, clarifying ambiguous terms, and specifying evaluation metrics for the project. It ensures analysts and stakeholders share a common understanding of what success looks like. Clear definitions prevent wasted effort on misaligned analyses.
Front
Decompose Step
Back
Decompose breaks a broad problem into contextual components, user groups, and smaller sub-questions that are easier to analyze. This step uncovers hidden assumptions and identifies data boundaries and constraints. Decomposition enables targeted solutions and parallel workstreams.
Front
Prioritize Step
Back
Prioritize involves ranking sub-questions and potential analyses based on impact, feasibility, and cost. Using tools like cost-benefit diagrams helps focus limited resources on the most valuable investigations first. Prioritization accelerates progress and increases the chance of delivering actionable insights quickly.
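One way to make such a ranking explicit is a simple score. The sketch below is an assumption for illustration, not a prescribed method: the 1 to 5 scales, the formula impact x feasibility / cost, and the three sub-questions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubQuestion:
    name: str
    impact: int       # 1 (low) to 5 (high), estimated with stakeholders
    feasibility: int  # 1 (hard) to 5 (easy), given available data and skills
    cost: int         # 1 (cheap) to 5 (expensive), in analyst time

    @property
    def score(self) -> float:
        # Hypothetical scoring rule: reward impact and feasibility, penalize cost.
        return self.impact * self.feasibility / self.cost


candidates = [
    SubQuestion("Which stores lost the most foot traffic?", impact=5, feasibility=4, cost=2),
    SubQuestion("Did the new layout change basket size?", impact=4, feasibility=2, cost=4),
    SubQuestion("Are stockouts concentrated in a few SKUs?", impact=3, feasibility=5, cost=1),
]

# Highest score first: these are the investigations to start with.
ranked = sorted(candidates, key=lambda q: q.score, reverse=True)
for question in ranked:
    print(f"{question.score:5.1f}  {question.name}")
```

The exact formula matters less than writing the estimates down, since that lets stakeholders challenge individual numbers instead of the final order.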
Front
Preliminary Analysis
Back
Preliminary analysis assesses data availability, quality, and initial relationships through data profiling and visualization. It helps determine whether the data can support the proposed analyses and identifies cleaning or additional data needs. Early visualization can reveal patterns, biases, and missingness that shape the project.
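A first profiling pass might look like the following sketch, assuming pandas is available. The small DataFrame and its column names are made up for the example.

```python
import pandas as pd

# Hypothetical store-level extract with some gaps in it.
df = pd.DataFrame({
    "store_id": [1, 2, 3, 4, 5],
    "sales": [1200.0, None, 950.0, 1100.0, None],
    "visits": [300, 280, None, 310, 150],
    "region": ["north", "south", "south", None, "north"],
})

# One row per column: type, missing values, and number of distinct values.
profile = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "missing_share": df.isna().mean(),
    "unique": df.nunique(),
})
print(profile)
```

A table like this shows quickly which columns need cleaning or additional sourcing before any modeling is attempted.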
Front
Six Ws
Back
The Six Ws (Who, When, Where, Why, What, How) provide a checklist to interrogate a problem and its context systematically. They help surface stakeholder groups, temporal and spatial constraints, root causes, scope limits, and actionable implications. Applying the Six Ws leads to more complete problem framing and solution design.
Front
Sales per Square Foot
Back
Sales per square foot measures how much revenue a store generates for each square foot of retail space. It is useful for comparing performance across stores of different sizes and for guiding real estate or layout decisions. This KPI highlights space utilization and merchandising effectiveness.
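As a worked example, the KPI is revenue divided by floor area. The function name, parameter names, and store figures below are hypothetical.

```python
def sales_per_square_foot(net_sales: float, selling_area_sqft: float) -> float:
    """Revenue generated per square foot of retail space over some period."""
    if selling_area_sqft <= 0:
        raise ValueError("selling_area_sqft must be positive")
    return net_sales / selling_area_sqft


# Hypothetical annual figures: (net sales, selling area in square feet).
stores = {
    "Store A": (1_200_000, 4_000),
    "Store B": (1_500_000, 7_500),
}
per_sqft = {name: sales_per_square_foot(*figures) for name, figures in stores.items()}
```

Here Store B has higher total sales, but Store A earns 300 per square foot against 200 for Store B, so the smaller store uses its space more efficiently.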
Front
Conversion Rate
Back
Conversion Rate is the proportion of visits that result in purchases or other desired actions, indicating how effectively visits turn into outcomes. It helps assess customer funnel performance and the impact of merchandising, promotions, or UX changes. Improving conversion often yields direct revenue gains.
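As a small worked example, the rate is conversions divided by visits. The function name and the counts are hypothetical.

```python
def conversion_rate(conversions: int, visits: int) -> float:
    """Share of visits that ended in a purchase or other desired action."""
    if visits <= 0:
        raise ValueError("visits must be positive")
    if not 0 <= conversions <= visits:
        raise ValueError("conversions must be between 0 and visits")
    return conversions / visits


# Hypothetical week: 1,500 store visits, 45 of which ended in a purchase.
rate = conversion_rate(45, 1_500)
print(f"{rate:.1%}")
```

The example assumes at most one conversion is counted per visit. If a visit can produce several purchases, the upper-bound check would need to change.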
Front
Inventory Turnover
Back
Inventory Turnover shows how frequently inventory is sold and replaced during a period, reflecting supply chain and merchandising efficiency. High turnover can indicate strong demand or understocking risks, while low turnover may signal overstock or poor product-market fit. It informs purchasing and assortment strategies.
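The card does not give a formula. One common definition, assumed here, divides cost of goods sold by average inventory for the period. The function name and the figures are hypothetical.

```python
def inventory_turnover(cogs: float, beginning_inventory: float, ending_inventory: float) -> float:
    """Times inventory was sold and replaced: COGS / average inventory (assumed definition)."""
    average_inventory = (beginning_inventory + ending_inventory) / 2
    if average_inventory <= 0:
        raise ValueError("average inventory must be positive")
    return cogs / average_inventory


# Hypothetical year: 600,000 in COGS, inventory of 140,000 at the start and 160,000 at the end.
turnover = inventory_turnover(600_000, 140_000, 160_000)
days_to_sell = 365 / turnover  # average days an item sits in stock
```

With these numbers the inventory turns over 4 times a year, or roughly every 91 days.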
Front
HEART Framework
Back
HEART is a user-experience metric framework that stands for Happiness, Engagement, Adoption, Retention, and Task success. It guides product teams to choose measurable indicators aligned with user-centered goals. Combining HEART metrics with behavioral analytics yields actionable UX insights.
Front
Cost-Benefit Diagram
Back
A cost-benefit diagram visually compares the expected costs and benefits of different analytic options to support prioritization. It helps teams weigh resource needs against potential impact and risk, making trade-offs explicit. This tool aids in selecting analyses that maximize value under constraints.
Front
Iteration
Back
Iteration emphasizes repeated cycles of analysis, refinement, and stakeholder feedback throughout data projects. It acknowledges that initial hypotheses and models are provisional and improves outcomes through successive adjustments. Iterative workflows reduce risk and better align results with business needs.