Instagram Collections A/B Testing

Timeframe: 1 Month (May/June 2024)

My Role: UX Researcher

Team: 2 UX Researchers

Introduction

Instagram Collections is a feature allowing users to save and categorize posts into albums, yet many users struggle to efficiently access their saved content. This study investigates whether adding a "Collections" button to the Instagram profile dashboard improves ease of access and user engagement. The research question focuses on whether this UI change will reduce the number of clicks needed to access collections and increase the completion rate for finding saved posts, ultimately encouraging users to save more posts over time.

The method involves an A/B test, with participants randomly assigned to either Test Cell A (current Instagram interface) or Test Cell B (interface with the new "Collections" button). Quantitative data will be collected automatically, including the completion rate for accessing collections, the number of clicks to access collections, and the number of posts added to collections. Additionally, qualitative data will be gathered through virtual think-aloud interviews, heat map tracking, and user satisfaction surveys.

We expect that the new "Collections" button will lead to a higher completion rate for accessing saved collections, fewer clicks needed to navigate to collections, and an increase in the number of posts added to collections. Positive user feedback is also anticipated, highlighting improved navigation and ease of use. The study aims to provide insights into how a simple UI change can significantly enhance user experience, guiding future design decisions for Instagram and similar platforms. The results will offer actionable data on improving accessibility features and understanding user behavior related to saved content.

Hypothesis

If a "Collections" button is added to the Instagram profile dashboard, then users will access their collections with fewer clicks (more efficiently), resulting in an increased task completion rate for finding saved posts. We also expect that users will save more posts over time due to the improved efficiency and accessibility of their collections.

Data Collection Plan

A/B Test Cells:

Test Cell A: The current version of Instagram, with the existing navigation path for accessing collections.

Test Cell B: The current version of Instagram with a new button on the profile page to access collections.

Key Metric: 

The key metric for this experiment is the completion rate for accessing saved collections, defined as the percentage of users who successfully navigate to their saved collections after having saved posts.
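
As a minimal sketch of how this definition could be computed, the snippet below counts only participants who have saved at least one post and reports the share of them who reached their collections. The field names `saved_posts` and `reached_collections` and the sample records are hypothetical and are not taken from the study's tooling.

```python
def completion_rate(participants):
    """Percentage of eligible participants (those who saved posts) who reached collections."""
    eligible = [p for p in participants if p["saved_posts"] > 0]
    if not eligible:
        return 0.0  # no eligible participants, so there is nothing to measure
    completed = sum(1 for p in eligible if p["reached_collections"])
    return 100 * completed / len(eligible)


# Made-up records for illustration only.
sample = [
    {"saved_posts": 3, "reached_collections": True},
    {"saved_posts": 1, "reached_collections": False},
    {"saved_posts": 0, "reached_collections": False},  # excluded: never saved a post
    {"saved_posts": 5, "reached_collections": True},
]
rate = completion_rate(sample)
```

With this made-up sample, two of the three eligible participants succeed, so the rate is about 66.7%.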

Key Metric Rationale: 

Measuring the completion rate for accessing saved collections on Instagram will directly reflect the impact of adding a 'Collections' button to the profile dashboard. This metric provides a clear indicator of whether users find it easier and quicker to access their saved collections with the new button. Other metrics to consider include the number of clicks and the number of posts added to collections. The number of clicks will help quantify the reduction in navigation steps required to access collections, while the number of posts added to collections will indicate whether improved accessibility encourages users to save more content.

By directly observing the completion rate, we can determine the effectiveness of the 'Collections' button in enhancing user experience. An increase in the completion rate would suggest that users are finding and accessing their collections more easily. Consequently, we can also expect an increase in the frequency of post saving and categorization activities, which can be measured by tracking the number of posts added to collections.

Length of Data Collection: 

The A/B test will run for one month to gather sufficient data on user interactions. Participants will be exposed to either the control or test version of the Instagram interface during their regular use of the app, without knowing which version they are using. This will help reduce bias and capture realistic usage patterns. Additionally, qualitative data collection methods such as interviews and surveys will be used to gain deeper insights into user experiences and satisfaction.

How the Data Is Collected

Qualitative and Quantitative Data:

  • Randomization: Participants will be randomly assigned to either the control (Test Cell A) or test (Test Cell B) group. This ensures that each group is comparable and that any differences in outcomes can be attributed to the introduction of the "Collections" button.

  • A/B Testing: Participants will complete the test through a UseBerry link. 

  • Demographics and Instagram Use Virtual Questionnaire: Basic demographic questions such as age, race/ethnicity, and occupation, as well as questions about their use and familiarity with Instagram. This helps in understanding if familiarity with the current Collections feature affects the outcomes.

  • Completion Rate: Automatically track whether participants successfully access their saved collections after saving posts. This is the key metric.

  • Number of Clicks: Automatically record the number of clicks it takes for users to access their saved collections. This helps in understanding the efficiency of the new button.

  • Posts Added to Collections: Track the number of posts users add to their collections, indicating if the improved accessibility encourages more usage.

  • Think-aloud Interview: Selected participants will be asked to perform tasks related to accessing their collections while describing their actions out loud. This helps in identifying any usability issues and gathering detailed feedback on their experiences.

  • Heat Map Tracking: During the interview, track where the participant is clicking to identify common navigation paths and potential areas of confusion.

  • Satisfaction Virtual Survey: After using the app, participants will complete the ASQ (After-Scenario Questionnaire) and an open-ended questionnaire to provide qualitative feedback on their satisfaction and any difficulties they encountered.
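
The random assignment in the first item of this list can be sketched as a shuffle followed by an even split. This is an illustrative sketch only: the participant IDs, the `assign_cells` helper, and the seed are assumptions, and the actual assignment may be handled by the testing platform.

```python
import random


def assign_cells(participant_ids, seed=None):
    """Shuffle participants and split them evenly into Test Cell A and Test Cell B."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "B": shuffled[half:]}


# Hypothetical IDs 1..100, matching the planned 100 participants.
assignment = assign_cells(range(1, 101), seed=2024)
```

Shuffling before splitting keeps the two cells the same size while leaving each participant's cell to chance.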

Sample Size & Rationale

Sample Size:

To support a robust statistical analysis, the experiment will involve 100 participants, with 50 participants randomly assigned to each test cell (Test Cell A and Test Cell B).

Rationale:

The sample size of 100 participants has been chosen based on several factors:

  • Statistical Power: A larger sample size increases the power of the statistical tests, making it more likely to detect a true effect if one exists. With 50 participants in each group, we aim to achieve a balance between statistical power and practical feasibility.

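  • Effect Size Detection: To detect a medium effect size (Cohen's d = 0.5) with a power of 0.8 at a 0.05 significance level using a one-tailed test, which matches the directional hypothesis, approximately 50 participants per group (about 100 in total) are needed. A two-tailed test with the same settings would require roughly 64 participants per group. The chosen sample size therefore gives the study adequate power to identify meaningful differences between the control and test groups.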

  • Variability in User Behavior: Instagram users exhibit a wide range of behaviors and engagement levels. A sample size of 100 participants allows us to capture this variability, providing a more comprehensive understanding of how the new "Collections" button impacts different types of users.

  • Generalization: Including a diverse group of 100 participants helps ensure that our findings are generalizable to the broader Instagram user base. By considering demographic factors such as age, gender, and familiarity with Instagram, we can better understand how different user segments are affected by the UI change.

  • Feasibility: While larger sample sizes can provide more accurate estimates, they also require more resources in terms of time, cost, and effort. A sample size of 100 participants strikes a practical balance, allowing us to conduct the study within a reasonable timeframe and budget while still obtaining reliable results.

In conclusion, a sample size of 100 participants (50 per test cell) is selected to provide sufficient statistical power, detect meaningful differences, account for user variability, ensure generalizability, and maintain practical feasibility. This sample size will enable us to confidently assess the impact of adding a "Collections" button to the Instagram profile dashboard.
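
The effect-size reasoning above can be checked with a rough calculation. The sketch below uses the common normal-approximation formula for two independent groups, n per group ≈ 2 × ((z_alpha + z_beta) / d)², so its results run slightly below those of an exact t-based power analysis. The function name and its defaults are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.8, one_sided=True):
    """Approximate participants needed per group (normal approximation)."""
    z = NormalDist().inv_cdf
    # A one-sided test puts all of alpha in one tail; a two-sided test splits it.
    z_alpha = z(1 - alpha) if one_sided else z(1 - alpha / 2)
    z_beta = z(power)
    return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


one_tailed = n_per_group(0.5, one_sided=True)
two_tailed = n_per_group(0.5, one_sided=False)
```

For d = 0.5 this gives 50 per group for a one-tailed test and 63 per group for a two-tailed test, so 50 participants per cell fits the directional hypothesis but would fall short for a two-tailed comparison.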

Data Analysis Plan

Analysis Method

Quantitative Analysis:

  • Independent T-Test: An independent t-test will be used to compare the key metric (completion rate for accessing saved collections) between Test Cell A and Test Cell B. This test is suitable because it compares the means of two independent groups to determine if there is a statistically significant difference.

    • Null Hypothesis (H0): Adding a "Collections" button to the profile dashboard has no impact on the completion rate for accessing saved collections.

    • Alternative Hypothesis (H1): Adding a "Collections" button to the profile dashboard increases the completion rate for accessing saved collections.

    • Significance Level (Alpha): A standard alpha value of 0.05 will be used to determine statistical significance. If the p-value is less than 0.05, we will reject the null hypothesis in favor of the alternative hypothesis.

  • Mean and Standard Deviation: The mean and standard deviation of the completion rate, number of clicks, and posts added to collections will be calculated for both Test Cell A and Test Cell B to summarize the data.
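
The quantitative steps in this list can be sketched with the Python standard library alone. The example computes the mean and standard deviation for each cell, then a pooled (equal-variance) independent-samples t statistic. Because the standard library has no t distribution, the one-sided p-value is approximated with the standard normal distribution. That is an assumption that is reasonable at 50 participants per cell, and a statistics package would give the exact value. The outcome data are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation) for one test cell."""
    return mean(values), stdev(values)


def pooled_t_statistic(control, treatment):
    """Student's t statistic for two independent samples, assuming equal variances."""
    n_c, n_t = len(control), len(treatment)
    mean_c, sd_c = summarize(control)
    mean_t, sd_t = summarize(treatment)
    pooled_var = ((n_c - 1) * sd_c**2 + (n_t - 1) * sd_t**2) / (n_c + n_t - 2)
    standard_error = sqrt(pooled_var * (1 / n_c + 1 / n_t))
    return (mean_t - mean_c) / standard_error


def one_sided_p_value(t_stat):
    """Approximate P(T >= t_stat) with the standard normal distribution."""
    return 1 - NormalDist().cdf(t_stat)


# Made-up outcomes: 1 = participant reached collections, 0 = did not.
cell_a = [1] * 30 + [0] * 20  # 60% completion in the control cell
cell_b = [1] * 42 + [0] * 8   # 84% completion in the test cell

t_stat = pooled_t_statistic(cell_a, cell_b)
p_value = one_sided_p_value(t_stat)
reject_h0 = p_value < 0.05  # alpha = 0.05, as stated in the plan
```

The test is one-sided because H1 predicts an increase in Test Cell B rather than a difference in either direction.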

Qualitative Analysis:

  • Thematic Analysis: Responses from the think-aloud interviews and satisfaction surveys will be analyzed using thematic analysis. This involves identifying, analyzing, and reporting patterns (themes) within the qualitative data.

  • Heat Map Analysis: Click heat maps generated during the think-aloud interviews will be analyzed to identify common navigation paths and areas of confusion.
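
As a rough illustration of the heat map step, recorded clicks can be binned into a coarse grid so that the most-clicked screen regions stand out. The coordinate format, the 50-pixel cell size, and the sample clicks below are assumptions. A dedicated testing tool would normally produce the heat map directly.

```python
from collections import Counter


def bin_clicks(clicks, cell_size=50):
    """Count clicks per grid cell; each click is an (x, y) pixel coordinate."""
    return Counter((x // cell_size, y // cell_size) for x, y in clicks)


# Made-up click coordinates for illustration only.
clicks = [(10, 20), (30, 45), (320, 610), (335, 640), (310, 605)]
grid = bin_clicks(clicks)
hottest_cell, hottest_count = grid.most_common(1)[0]
```

Here the hottest grid cell is (6, 12) with three clicks. In practice the grid counts from each test cell would be compared to see where participants look for their collections.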

Analysis Method Rationale

  • Type of Data:

    • Completion Rate: This is a numeric, quantitative, comparable, continuous, and actionable metric that provides direct insight into user behavior. It will be summarized in terms of average completion rates for each test cell.

    • Number of Clicks and Posts Added: These secondary metrics will provide additional context to the completion rate, helping to understand efficiency and user engagement.

  • Sample:

    • Independent Samples: One key metric (completion rate) and two secondary metrics (number of clicks and posts added) are being collected for two independent samples (Test Cell A and Test Cell B). This necessitates the use of an independent t-test for comparison.

  • Purpose:

    • The primary purpose of the experiment is to compare completion rates between the control (Test Cell A) and test (Test Cell B) groups and observe the difference. This will help us understand the impact of the new "Collections" button on user behavior.

    • Secondary metrics and qualitative feedback will provide a more comprehensive understanding of user experience and the practical benefits of the new UI feature.

Expected Outcomes

  • Primary Metric: We expect to see a higher completion rate for accessing saved collections in Test Cell B compared to Test Cell A, indicating improved efficiency and ease of access with the new "Collections" button.

  • Secondary Metrics: We expect a reduction in the number of clicks required to access collections in Test Cell B, along with an increase in the number of posts added to collections in Test Cell B, suggesting enhanced user engagement.

  • Qualitative Insights: User feedback from interviews and surveys is expected to highlight the usability improvements and potential areas for further enhancement.

By combining quantitative and qualitative analyses, we aim to gain a holistic understanding of how the "Collections" button affects user experience and engagement on Instagram.

Discussion

Several factors could influence the experiment's outcome. First, while our hypothesis focuses on the completion rate for accessing saved collections, users who are familiar with Instagram's current layout may need an adjustment period, so it may be beneficial to examine engagement beyond the initial one-month experiment to capture long-term behavioral changes. Second, the number of clicks needed to reach completion is a crucial metric, since fewer navigation steps are expected to improve user efficiency and satisfaction. Lastly, while the t-test is appropriate for analyzing the completion rates, qualitative insights from interviews and open-ended survey responses could provide a deeper understanding of user experiences and potential areas for improvement. Considering these factors will strengthen the findings and support a thorough evaluation of the UI change.

The results of the experiment can be interpreted through both quantitative and qualitative lenses. If the completion rate for accessing saved collections is significantly higher in Test Cell B than in Test Cell A, this would indicate that the addition of the "Collections" button improved ease of access. A statistically significant difference (p-value < 0.05) would indicate that the observed improvement is unlikely to be due to random chance. Furthermore, a reduction in the number of clicks needed to access collections in Test Cell B would support the conclusion that the new button simplifies navigation and enhances user efficiency. An increase in the number of posts added to collections in Test Cell B would suggest that easier access encourages more frequent use of this feature, indicating a positive change in user behavior. Positive user feedback, such as comments about ease of use, convenience, and improved navigation, would reinforce the quantitative findings, while negative feedback or suggestions for further improvements could highlight areas for additional refinement. Analysis of click patterns and verbal feedback during tasks will provide insights into user navigation strategies and potential pain points, helping to identify any remaining usability issues that need to be addressed.

If the findings support the hypothesis, the "Collections" button should be implemented in the standard Instagram interface to improve user experience by providing quicker and more intuitive access to saved collections. If the qualitative data indicates that users are still uncertain about some features, additional tips or tutorials may be needed to educate users about the Collections feature. Any usability issues identified through qualitative feedback and heat map analysis should be addressed in future design iterations, potentially through changes to the button's placement or design, or through supporting features that further enhance accessibility.

Several factors could affect the success of the experiment. Sample bias could arise if the sample is not representative of the broader Instagram user base, making the results less generalizable. Ensuring diverse demographics and usage patterns among participants is therefore imperative. Additionally, participants may not use the app as frequently during the study period, leading to insufficient data for some metrics; extending the study duration or increasing the sample size can help ensure robust data collection. Interpretation could also become difficult if secondary metrics, such as the number of clicks or posts added, do not align with the key metric. Defining every metric clearly and tying it to the research goals is essential to avoid this issue. By anticipating these potential problems and addressing them proactively, the reliability and validity of the study's findings can be enhanced.

Our Presentation