Quality Performance in the Refugee Appeal Division 2018-19

Report of results

Prepared by:
Planning, Evaluation and Performance Measurement Directorate

July 2019


(Some content has been removed due to solicitor-client privilege)

Table of contents

1.0 Background
2.0 Performance Results
  2.1 Proceedings Respect the Parties
  2.2 Reasons are Complete
  2.3 Reasons are Clear and Concise
  2.4 Decisions are Timely
  2.5 Additional Analysis
  2.6 Analysis of Select Appeals Involving Chairperson's Guideline 9: Proceedings before the IRB Involving Sexual Orientation and Gender Identity and Expression
  2.7 Summary
  Appendix A – Performance Indicators and Rating Scale
  Appendix B – SOGIE Analytical Framework
3.0 Evaluation Report of Doug Ewart, LL.B, LL.M
  Appendix I
  Appendix II – Biography of Doug Ewart, 2018 Quality Assurance Evaluator
Notes

1.0 Background

This report describes the results of the evaluation of quality in decision-making in the Refugee Appeal Division (RAD) in fiscal year 2018-19. The results of this study, reported in aggregate form, support decisions by the Division to improve quality and fulfill reporting requirements under the Policy on Results. This is the second review of quality in the RAD, following the inaugural study completed in 2016-17.

Overall Approach

As with the quality studies of all IRB divisions, this study examines the key indicators of quality that align with the IRB’s overall expected results for decision-making excellence:

  1. Timely and complete pre-proceeding readiness
  2. Respectful proceedings
  3. Focused proceedings
  4. Clear, complete, concise and timely decisions

The study analyzed a representative sample of appeals finalized during April to June 2018 and includes a subsample of appeals involving Chairperson's Guideline 9: Proceedings before the IRB Involving Sexual Orientation and Gender Identity and Expression (SOGIE). The Guideline was released in May 2017, and this study is the first systematic review of how members have applied Guideline 9, using an assessment framework piloted here for the first time by the IRB’s Planning, Evaluation and Performance Measurement Directorate (PEPM).

As with the 2016-17 study, this year PEPM engaged an expert consultant to review the RAD record of each sampled appeal. The consultant, Mr. Doug Ewart, then applied a set of performance indicators, scoring each along a 1-to-3 rating scale, and critiqued each decision.

The consultant’s interpretation and application of each performance indicator is described in Appendix I. As always, the correctness of a member’s decision falls outside the scope of the review.

The report unfolds in two sections. The first section consists of PEPM’s statistical and aggregated findings based on an analysis of quantitative and qualitative data (the consultant’s scores and critiques for approximately 50 performance indicators across 78 appeals). The second section reports the consultant’s specific findings on divisional quality and PEPM’s methodology.

Sample Design and Selection

The study examined a sample of decisions finalized on the merits during the months of April to June 2018 by members with at least one year of experience as a member of the IRB. The study found 250 matching appeals, which formed the population of cases to examine. From that population, PEPM randomly selected 78 appeals, decided by 18 different experienced members, to form the sampleFootnote 1. Proportions within the sample were sized to approximate the actual composition of the population by region, claimant and Ministerial appeals, paper and oral appeals, and language of appeal. Sample results are considered accurate to within 8 percent of the population, 9 times out of 10. However, the goal of this study was not statistical certitude, but to identify areas of strength and concern to support senior management decision-making.
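To make the precision claim concrete, the following minimal sketch computes a margin of error for a sample of 78 appeals drawn from a population of 250 at a 90 percent confidence level. The report does not state its exact method; this sketch assumes simple random sampling, the conservative proportion p = 0.5 and a finite population correction, and is illustrative only.

    import math

    # Illustrative margin-of-error check (not the report's own calculation).
    # Assumes simple random sampling with a finite population correction.
    N = 250    # population of eligible appeals
    n = 78     # appeals sampled
    z = 1.645  # z-score for 90% confidence ("9 times out of 10")
    p = 0.5    # most conservative proportion assumption

    fpc = math.sqrt((N - n) / (N - 1))          # finite population correction
    moe = z * math.sqrt(p * (1 - p) / n) * fpc  # margin of error

    print(f"Margin of error: {moe:.1%}")  # ~7.7%, consistent with "within 8 percent"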

The following charts illustrate the sample makeup:


Text format

This is a bar chart including six bars.

The first bar is entitled ’78 Appeals Sampled,’ which describes the sample distribution of appeals across the IRB’s regions. There are six appeals from the Western region, thirty-seven from the Central region, and thirty-five from the Eastern region.

The second bar is entitled ‘Regional Proportion,’ which describes the proportion of appeals sampled in each region. Eight percent of the sample was taken from the Western region, forty-seven percent from the Central region, and forty-five percent from the Eastern region.

The third bar is entitled ’18 Experienced Members,’ which describes the distribution of members across the regions. Three were in the Western region, eight in the Central region, and seven in the Eastern region.

The fourth bar is entitled ‘Paper and Oral Proceedings,’ which describes the percentage of proceedings that were conducted on paper or orally. One hundred percent of the proceedings were conducted on paper.

The fifth bar is entitled ‘Appellant,’ which describes the make-up of appellants. Ninety-nine percent of appellants were claimants and one percent were the Minister.

The sixth bar is entitled ‘Language of Appeal,’ which describes the percentage of appeals made in English and French. Seventy-five percent of appeals were processed in English and twenty-five percent were processed in French.

Limitations

The findings of this report are solely those of the evaluation team. Their observations are necessarily subjective in nature and do not lend themselves to firm conclusions on legal matters such as the correct application of the law, the weighing of the evidence, or the fairness of the proceedings from a natural justice perspective. Only a court reviewing the case can arrive at such conclusions. This report aims to provide a perspective to improve the Division’s performance overall. The study also relied on material protected under solicitor-client privilege. As such, some contents have been removed.

2.0 Performance Results

2.1 Proceedings Respect the Parties

Why evaluate this:

Individuals appearing before the IRB expect that they will be treated with sensitivity, respect and fairness. Any shortcoming in this regard potentially undermines tribunal integrity and public confidence.

What was evaluated:

Indicator | Avg. Score out of 3 (target: 2.0) | % of Cases Scoring at Least 2.0
#1 The member ensures parties have an opportunity to present and respond to evidence and to make representations (paper and oral appeals). | 2.1 | 75%

What the results say:

  • Question #1 assesses how the member proceeds when the member raises a new issue from the record or where a change in country conditions is relied upon. Notice to the parties is required when the RAD raises an issue not addressed in the RPD decision or in the appellant’s submissions.Footnote 2 The study found that in 12 appeals the member addressed a new issue, and in most cases the parties were notified with an opportunity to respond. As the evaluator observes of one case:

    Seeks clarifying submissions from counsel and relies on them in the decision. However, on credibility the member says that he will add concerns outside those raised by counsel if they are in the reasons OR were spoken of at the hearing.

  • However, in 3 cases, members made findings not raised elsewhere without first giving notice.

    Deals with an issue not addressed in the RPD decision or the [appellant’s] memo (identity of the organization in question), though the appeal turns on other issues. Also arguable that additional submissions should have been sought on state protection since an express finding was made on it.

    Makes extensive findings about a medical report being fraudulent based on its contents, some grammatical errors and prevalence of fraudulent documents in the area. No such finding was made at the RPD and the issue of the report possibly being fraudulent was not addressed in the A’s Memo.

2.2 Reasons are Complete

Why evaluate this:

The completeness of a member’s reasons is essential for parties to be able to understand the decision and to allow for meaningful review by the RPD or the Federal Court, as applicable. Through questions #2 to #15 on the following four pages, this study considers the structure of decisions, their adequacy and use of policy instruments and other tools of the RAD.

What was evaluated:

Indicator | Avg. Score out of 3 (target: 2.0) | % of Cases Scoring at Least 2.0
#2 The member succinctly summarizes the main issues. | 1.2 | 16%
#3 The member addresses the positions of all parties, if appropriate. | 2.3 | 96%
#4 The member identifies the determinative issue(s) and writes only on the determinative issue(s). | 1.7 | 61%
#5 The member makes clear, unambiguous findings of fact. | 2.2 | 90%
#6 The member supports findings of fact with clear examples of evidence shown to be probative of these findings. | 2.3 | 95%
#7 The member addresses parties’ evidence that runs contrary to the member’s decision, and why certain evidence was preferred. | 2.2 | 96%
#8 The member identifies legislation, regulations, rules, Jurisprudential Guides, Chairperson’s Guidelines or persuasive decisions where appropriate. | 2.4 | 72%
#9 The member takes into account social and cultural contextual factors in assessing evidence. | 2.3 | 82%

What the results say:

Strengths

  • In nearly all decisions, members addressed the positions of the parties, supported findings of fact with probative evidence, and cited applicable law or policy instruments. The high percentage of decisions meeting the expected standard on #5 to #8 is consistent with, or a slight improvement on, the 2016-17 study. For further analysis, see the consultant evaluator’s report on pages 27-33.
  • In a notable area of improvement, 23 of 28 decisions met or exceeded the IRB standard of “2” for adverting to and filtering out potential socio-cultural biases from the assessment of the claim.

Text format

This is a bar chart depicting 28 bars, representing the 28 decisions assessed on whether socio-cultural contextual factors were taken into account in assessing the claim. Five decisions scored a ‘1’, eleven scored a ‘2’ and twelve scored a ‘3’.

Opportunities

  • Overview/summary: The study found an endemic under-use of the summary in all regions among the 18 members sampled.
  • Members are instructed on the importance of an executive summary that includes the outcome of the appeal and the member’s areas of focus. However, the study found that 66 decisions began with no statement or a bare statement of the issues. An example of a bare statement is, “The issues in this appeal are credibility and whether there is an IFA.”Footnote 3 In some cases, the reasons launched immediately into the analysis and revealed the decision only many pages later.

Text format

This is a bar chart depicting 78 bars that describe how many decisions had an effective executive summary. Sixty-six decisions scored a ‘1’. Eleven decisions scored a ‘2’. One decision scored a ‘3’.

Recommendation 1: Most decisions did not feature an effective executive summary highlighting the member’s decision and key supporting reasons. These decisions tended to dive immediately into the analysis, revealing the final decision only much later. The RAD should consider a communiqué or professional development refresher to members on the use of an effective overview at the beginning of each decision.

  • Focus on determinative issues: Question #4 is new in this year’s study and assesses the extent to which members identified and focused on issues determinative to the appeal. Most decisions were found to have adequately kept to the determinative issues but 31 (40%) lacked such focus. The most common example was members writing a lengthy analysis on matters not germane to the issue on which the appeal was ultimately decided. An extreme example was a decision that “[identified] an issue as determinative after analyzing and determining it, but then goes on to deal at equal overall length with six other issues, variously stating that they are 'moot', 'not determinative' (3 issues) or 'not probative' (2 issues).”

Text format

This is a bar chart depicting the number of decisions that lacked focus on the determinative issues. Of the 78 decisions included, thirty-one decisions scored a ‘1’. Forty decisions scored a ‘2’ and seven decisions scored a ‘3’.

Recommendation 2: The RAD should consider a communiqué or professional development refresher to members to identify and remain focused on the determinative issues.

  • References: As noted above, the study found that 18 reasons (72%) appropriately cited the legal or policy authority relevant to the member’s decision. Whether it was case law, a Chairperson’s Guideline or a Jurisprudential Guide, these reasons analyzed the authority and its relevance in a meaningful and explanatory way. Seven reasons (28%), however, only acknowledged a Guideline or case law generically rather than describing how or why it applied to the issue at hand. As the evaluator noted, “While there was less use of the unsupported statement that ‘I have fully considered [the applicable Guideline]’, there was also not much express indication that serious, analytical and purposive attention was being paid to a relevant Guideline.”

2.3 Reasons are Clear and Concise

What was evaluated:

Indicator | Avg. Score out of 3 (target: 2.0) | % of Cases Scoring at Least 2.0
#10 The member uses plain language. | 1.8 | 71%
#11 The member gives appropriately clear and concise reasons. | 1.9 | 70%
#12 The reasons are easily understood and logically sequenced. | 1.9 | 69%
#13 The reasons are as short and economical as possible, taking into account the complexities of the appeal and volume of evidence. | 1.7 | 53%

What the results say:


Text format

This is a bar chart describing the 18 sampled members’ results for transparent and intelligible reasons. Six members scored a ‘2’ or higher, satisfying expectations. Twelve members scored below a ‘2’, falling short of expectations for transparent and intelligible reasons. The average score was 1.8.

  • Whereas earlier charts display results by decision, this chart shows decision-writing quality by member, as an average of each member’s scores from #10 to #13 across that member’s body of decisions
  • The number of decisions by each member varies from a low of 1 to a high of 11, reflecting the different levels of productivity
  • The study found that 12 members fell below the 2.0 target by not consistently demonstrating each of the elements in #10 to #13. For example, in examining the extent of plain language use, the evaluator found frequent instances of long and complex sentences and long paragraphs. Typical observations are:

    Extensive technical text on a matter not in issue detracts from the readability of the decision as a whole.

    The decision contains some very long paragraphs that create comprehension problems for the average reader. As well, unrelated issues are sometimes found in the same paragraph (even in some of the short ones), adding challenges to following the logic of the decision. Long sentences and the use of subordinate clauses would also create difficulties for many readers.

  • Furthermore, in examining the logical sequence of members’ reasons under #12, the evaluator noted:

    No real structure provided: just a chronological walk through numerous credibility issues. Headings help, but there is no guide to how the decision will unfold.

    Reasons can be hard to follow: seem to jump among issues and sub-issues, with four pages of analysis under a single generic heading without any sub-heads or clear organization around discrete issues.

  • By contrast, 6 members produced 18 reasons that were for the most part concise, organized, economical and in plain language.

    While more issues than necessary appear to have been addressed, each is given a clear resolution. While additional conciseness may have been possible, the length appears to reflect care to fully address concerns rather than careless drafting.

    Writes only on what is necessary for the decision; avoids lengthy analyses where the conclusions are clear. Resolves state protection and IFA in a single concise paragraph.

2.4 Decisions are Timely

Why evaluate this:

A timely decision contributes to resolving uncertainties among the parties and to meeting the IRB’s mission.

What was evaluated:

Indicator | Average (Days) | % of Cases within 90 Days
#14 Among all 243 paper appeals decided during April to June 2018, the average number of days from appeal perfection to decision (Federal Court returns excluded). | 266 | 4%
#15 Among all 243 paper appeals decided during April to June 2018, the average number of days from member assignment to decision (Federal Court returns excluded). | 39 | N/A

What the results say:

  • The average time to decide all appeals after perfection of the file was 266 days (excluding appeals returned by the Federal Court)
  • 85% of that time (227 days) consisted of various case processing activities by the Division, including digitization, triage and matching the file to an available member (see the sketch below)
  • Once a member was assigned, the member was able to render a decision with reasons within 39 days
  • A clear improvement since the last study is the faster processing of French-language appeals; these cases were assigned to a member within 215 days and decided 38 days later. This is a definitive improvement from 2016, when a shortage of members in the Eastern Region left French appeals taking twice as long to process as English appeals
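The split of these figures can be sanity-checked with simple arithmetic; the sketch below merely recombines the national averages reported above and is illustrative only.

    # Illustrative check of the reported timeliness figures (national averages).
    days_to_assignment = 227  # digitization, triage and matching to a member
    days_to_decision = 39     # member assignment to decision with reasons

    total_days = days_to_assignment + days_to_decision
    share_before_assignment = days_to_assignment / total_days

    print(total_days)                        # 266 days from perfection to decision
    print(f"{share_before_assignment:.0%}")  # 85% of elapsed time precedes assignment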

Text format

This is a bar chart consisting of four bars that describes the average number of days taken to process an appeal.

The first bar represents the national average of days taken to process an appeal. It took 227 days to assign a member and 39 days to render a decision, totaling 266 days.

The second bar represents the Western average of days taken to process an appeal. It took 249 days to assign a member and 26 days to render a decision, totaling 275 days.

The third bar represents the Central average of days taken to process an appeal. It took 218 days to assign a member and 38 days to render a decision, totaling 256 days.

The fourth bar represents the Eastern average of days taken to process an appeal. It took 233 days to assign a member and 42 days to render a decision, totaling 275 days.


Text format

This is a bar chart consisting of two bars that describe the average number of days taken to process French and English appeals. French appeals are processed faster than English appeals, which is a reversal from 2016.

The first bar represents French appeal processing times. It took 215 days to assign a member and 38 days to render a decision, totaling 253 days.

The second bar represents English appeal processing times. It took 235 days to assign a member and 40 days to render a decision, totaling 275 days.

2.5 Additional Analysis

Why evaluate this:

The evaluation consultant developed and tested eight supplementary performance questions on reasons completeness, transparency and intelligibility that delve deeper into aspects of quality not directly addressed by the IRB’s performance indicators. Reasons were assessed against these supplementary indicators along the same numerical rating scale, but the scores do not factor into the overall results.

What was evaluated:

Indicator | Avg. Score out of 3 (target: 2.0) | % of Cases Scoring at Least 2.0
#16 The reasons apply the appropriate tests for the admission of new evidence. | 2.8 | 89%
#17 The decision considers all relevant issues from the record as appropriate. | 1.8 | 80%
#18 The reasons are structured around the issues (i.e. they are issue-driven). | 2.3 | 68%
#19 The reasons show a clear logic path to the result. | 1.7 | 37%
#20 The reasons are likely to explain the result to the subject of the appeal. | 2.8 | 88%
#21 The reasons appear to provide useful guidance to the RPD and other readers (e.g. on CanLII). | 1.7 | 57%
#22 The member uses strategies to achieve finality. | 2.4 | 85%
#23 The member conducted an independent assessment of the claim rather than just a review of errors made by the RPD. | 2.9 | 97%

What the results say:

  • Comparison to 2016-17: Aggregate results met the IRB standard and are consistent with the 2016-17 results with respect to reasons applying the appropriate test for admitting new evidence, explaining the result to the appellant, and carrying out an independent assessment of the claim (#16, #20, #23).

    However, the IRB standard was not met for reasons showing a clear logic path to the result (#19). Rather than analyze and resolve each issue in a discrete package that propels the reader toward a foretold decision, most reasons still lacked a helpful overall structure.

  • Achieving finality: The RAD contributes to system efficiency and certainty among parties by bringing a claim to finality, either by confirming or substituting a decision or by returning the claim to the RPD with clear instructions. Of the 27 decisions where this was applicable, the study found that in 23 the member undertook strategies to achieve finality of the claim. These decisions were clearly based on a complete consideration of the evidence and the RPD record, with a purposive strategy to substitute the RAD’s own decision or to direct the RPD on re-hearing the claim. Some common observations of the evaluator were:

    Goes to considerable pains to find and support a clear basis for substituting a positive decision, especially considering that the RPD decision was almost entirely based on identity and credibility.

    Having found that the RPD did not have a sustainable basis to dismiss the credibility of the [appellant], the RAD goes on to find its own basis for so doing, using a new matter and some of the RPD findings.

    No substitution option here as the merits had not been sufficiently canvassed at the RPD. Clear that the recording had been listened to for the purpose of determining whether substitution were possible and clear finding of why it was not.

    By contrast, only 4 decisions did not provide substantive directions when returning the matter to the RPD or did not explain why a substitution was not possible when it was appropriate to do so. Achieving finality is not considered a legal requirement but rather good tribunal practice.

2.6 Analysis of Select Appeals Involving Chairperson's Guideline 9: Proceedings before the IRB Involving Sexual Orientation and Gender Identity and Expression

Why evaluate this:

Chairperson's Guideline 9 was released in May 2017 to address the particular challenges that individuals with diverse SOGIE may face in presenting their cases before the IRB. It also establishes guiding principles for members in adjudicating cases involving SOGIE. To determine how members of the RAD are applying Guideline 9 in its early implementation, the study’s sample included 9 appeals (12%) involving SOGIE claimants. Given the small subsample size, the object was to achieve an early understanding of how members engage Guideline 9, and to refine the analytical framework for future iterations.

PEPM provided the evaluator with 13 performance indicators designed to assess the application of Guideline 9 in all four of the IRB’s divisions. The indicators are premised on specific sections of Guideline 9 that are framed as obligations members are expected to follow, save for compelling or exceptional reasons. For example, sections 5.3 and 7.3 state respectively that “decision-makers should, wherever possible, avoid the use of personal identifiers or sensitive information that is not necessary to explain the reasoning in the decision” and “Questioning should be done in a sensitive, non-confrontational manner” (emphasis added). Armed with a series of such questions (see Appendix B), the goal was to carry out an objective assessment of Guideline 9 compliance.

What the evaluator found was that most of the 13 questions do not apply in the RAD, as they pertain only to oral hearings. Questions such as accommodation of vulnerable witnesses, the tone and demeanour of parties, or questioning in a sensitive manner were not evaluable. Rather, Questions #28, #29 and #31 were found to have the highest relevance in the RAD:

Question | Yes | No
#28 Whenever possible, did the decision-maker avoid the use of personal identifiers or sensitive information that is not necessary to explain the reasoning in the decision? | 3 | 4
#29 Did the decision-maker rely on stereotypes or inappropriate assumptions? | 1 | 7
#31 If there were inconsistencies or omissions in the individual’s evidence, did the decision-maker examine whether there were cultural, psychological or other barriers that may reasonably explain them? | 4 | 1

What the results say:

  • Protection of sensitive information: Section 5.1 of the Guideline provides that “even though proceedings before the RPD and the RAD are private, if a case is before the Federal Court for judicial review, the information in the Federal Court file pertaining to the case becomes publicly accessible.” The study found, however, that 4 reasons identified the names of the claimants’ same-sex partners still living in the country of reference. One decision also named a state official. These reasons revealed potentially sufficient identity information to out a person or pose a risk of harm. Of one decision, the evaluator observed:

    Decision uses the last name of one same-sex partner and the first name of another, as well as the full name of an official in the relevant country, information which if connected to the [appellant] could reveal their identities in a country where same-sex relationships carry a high risk of adverse state, family and social consequences.

    Recommendation 3: Members should be reminded that when drafting reasons for decision, wherever possible, personal identifiers or sensitive information that are not necessary to explain the reasoning in the decision should be avoided in the event the reasons become publicly accessible upon judicial review.

  • Stereotypes and inconsistencies: Most decisions avoided stereotypes or inappropriate assumptions when making findings. Most decisions also considered cultural, psychological or other barriers to explain inconsistencies in the evidence. The RAD generally not only avoided stereotypes and inappropriate assumptions in its own decisions, but also found errors in RPD decisions for precisely these reasons.

    For a more detailed analysis of SOGIE cases and the evaluation instrument, see pages 33 to 40 below.

2.7 Summary

This study recruited an expert consultant to independently review 78 randomly selected appeals finalized by experienced members during fiscal year 2018-19. The review included a subsample of appeals involving SOGIE. The study aggregated the consultant’s findings to discern trends and patterns in decision-making to support performance management decisions. Specific observations made by the consultant are described in the next part of this report.

Overall, the study calculated an aggregate score of 2.1 out of 3.0 for the Division, meeting the target established by the IRB. This composite rating, however, masks demonstrated strengths and opportunities in underlying areas summarized as follows:

  • In nearly all decisions, members addressed the positions of the parties, supported findings of fact with probative evidence, and cited applicable policy instruments or law.
  • Most decisions were alert to and filtered out potential socio-cultural biases from the assessment of the claim.
  • When it was appropriate to do so, most members undertook strategies to achieve finality in a claim whether by substituting the decision or remitting the claim with directions to the RPD.
  • The once lengthy processing time of French-language appeals has been corrected.
  • Members widely avoided stereotypes and cultural assumptions in SOGIE cases.
  • A few reasons unnecessarily named same-sex partners in the country of reference.
  • There was wide underuse of the executive summary at the opening of the decision.
  • 4 in 10 reasons did not stay focused on just the determinative issues.
  • Most members’ reasons fell short of being consistently clear, concise, organized, and in plain language.
Recommendations:
  1. The RAD should consider a communiqué or professional development refresher to members on the use of an effective overview/summary at the beginning of each decision.
  2. The RAD should also consider a communiqué or professional development refresher to members to identify and remain focused on the determinative issues.
  3. Members should be reminded that when drafting reasons for decision, wherever possible, personal identifiers or sensitive information that are not necessary to explain the reasoning in the decision should be avoided in the event the reasons become publicly accessible upon judicial review.
  4. The consultant recommended that PEPM review the checklist questions and their interpretive guide (see Appendix I) to improve them as an assessment instrument for the next evaluation.

Appendix A - Performance indicators and rating scale

A. Respectful proceedings
1 | The member ensures parties have an opportunity to present and respond to evidence and to make representations. | 1 2 3
B. Reasons are complete
2 | The member succinctly summarizes the main issues. | 1 2 3
3 | The member addresses the positions of all parties, if appropriate. | 1 2 3
4 | The member identifies the determinative issue(s) and writes only on the determinative issue(s). | 1 2 3
5 | The member makes clear, unambiguous findings of fact. | 1 2 3
6 | The member supports findings of fact with clear examples of evidence shown to be probative of these findings. | 1 2 3
7 | The member addresses parties' evidence that runs contrary to the member's decision, and why certain evidence was preferred. | 1 2 3
8 | The member identifies legislation, regulations, rules, Jurisprudential Guides, Chairperson's Guidelines or persuasive decisions where appropriate. | 1 3
9 | The member takes into account social and cultural contextual factors in assessing evidence. | 1 2 3
C. Reasons are transparent and intelligible
10 | The member uses plain language. | 1 2 3
11 | The member gives appropriately clear and concise reasons. | 1 2 3
12 | Reasons are easily understood and logically sequenced. | 1 2 3
13 | The reasons are as short and economical as possible, taking into account the complexities of the appeal and volume of evidence. | 1 2 3
D. Decision is rendered as soon as practicable
14 | Number of days after the file was assigned to the member to render the decision.
15 | Number of days after the appeal was perfected to render the decision.
Supplementary questions
16 | The reasons apply the appropriate tests for the admission of new evidence. | 1 3
17 | The decision considers all relevant issues from the record as appropriate. | 1 2 3
18 | The reasons are structured around the determinative issues (i.e. they are issue-driven). | 1 3
19 | The reasons show a clear logic path to the result. | 1 3
20 | The reasons are likely to explain the result to the subject of the appeal. | 1 3
21 | The reasons appear to provide useful guidance to the RPD and other readers (e.g. on CanLII). | 1 2 3
22 | The member uses strategies to achieve finality. | 1 2 3
23 | The member conducted an independent assessment of the claim rather than a review of errors made by the RPD. | 1 3
E. Rating guide
  1. Needs Improvement:
    The quality requirement was not met. The evidence showed one or more key instances where the proceeding or reasons would have markedly benefited had this requirement been met. There may have been an effort to apply the requirement but the level of achievement fell short of expectations.
  2. Meets Expectations:
    This is a level of acceptable achievement. On balance, the decision-maker satisfied this quality requirement though there is margin for minor improvement.
  3. Above Expectations:
    This is a level of consistent acceptable achievement. The evidence shows a grasp of the quality requirement and an understanding of its importance to a high-quality proceeding or decision, as the case may be.

Appendix B - SOGIE analytical framework


(Each question below is answered Yes or No.)
24. Did the member consider any accommodations under the Chairperson’s Guideline 8, if appropriate, whether requested by a party or on the member’s own initiative?
25. If an individual asserted an independent appeal based on sexual orientation or gender identity or expression, did the member consider separation of joined appeals, if appropriate?
26. Did the member address and refer to the individual by their chosen name, terminology, and pronouns?
27. If there were any issues about a participant’s conduct in a proceeding, including tone and demeanour, or any misunderstandings about the use of appropriate language, did the member address those issues as soon as they arose?
28. Whenever possible, did the member avoid the use of personal identifiers or sensitive information that is not necessary to explain the reasoning in the decision?
29. Did the member rely on stereotypes or inappropriate assumptions?
30. Was questioning done in a sensitive, non-confrontational manner?
31. If there were inconsistencies or omissions in the individual’s evidence, did the member examine whether there were cultural, psychological or other barriers that may reasonably explain them?
32. Did the member consider intersectional factors such as race, ethnicity, religion, faith or belief system, age, disability, health status, social class and education when determining whether an individual has established a well-founded fear of persecution?
33. Did the member exercise caution before drawing negative inferences from discrepancies in gender identification documents?
34. If the case involves a minor with diverse SOGIE, did the member consider the application of Chairperson's Guideline 3: Child Refugee Claimants—Procedural and Evidentiary Issues, if appropriate?
35. Did the member consider laws of general application that are used to target individuals with diverse SOGIE?
36. If in the country of reference there is a lack of documentation reporting on the treatment of individuals with diverse SOGIE, did the member consider the circumstances in the country that may inform the absence of such documentation?

Evaluation report of Doug Ewart, LL.B, LL.M.

Introduction to the Report

This Report offers an overview of the 2018 quality assessment of Refugee Appeal Division (RAD) decisionsFootnote 4. It is the second such assessment since the creation of the Division. It was conducted by the same quality assurance evaluator and its purpose remains the same as in the initial (2016) assessment. That purpose is to provide an independent and objective assessment of a sample of RAD decisions using the detailed evaluative checklist developed by the Board.

This checklist-based assessment of individual RAD decisions generates the information for the aggregated decision quality analysis prepared by the IRB’s Planning, Evaluation and Performance Measurement Directorate. It is that aggregated analysis, rather than this Report, that serves as the basis for the Board’s determination of areas where decisions as a whole are in compliance with Board expectations, and areas where steps should be taken to improve the quality of RAD decisions.

Accordingly, the main product of this assessment was the completed evaluation checklists that have been provided electronically to the Directorate. This Report supplements that database by outlining the process for, and contextualizing, the individual assessments. The Report also offers comments on issues that arose frequently in the decisions reviewed, along with suggestions for improvements to the evaluation instrument. It does not purport to draw conclusions about the quality of RAD decisions as a whole.

This year there was also a distinct review of RAD decisions in cases that involved, or should have involved, consideration of the SOGIE guideline. That review assessed those decisions using a new checklist developed by the Directorate for the purpose. In addition, the evaluator was asked to reflect on and prepare an analysis of the design and utility of that checklist for assessing the application of the Guideline by RAD Members. Both the specific evaluations, and the assessment of the evaluation instrument, are discussed in Part II of this Report.

A Note of Appreciation for Directorate Staff

As was the case in 2016, this review was significantly aided by the staff of the Directorate. They are of course responsible for the design of the assessment process and evaluation instrument. As well, they unfailingly provided very valuable advice, guidance and support throughout the review process. They were unstinting in sharing their expertise, insights and time and made the entire process a more productive one.

Evaluation of 2018 paper appeal decisions

1.0 The assessment process

1.1 Overview

This assessment of RAD decisions was conducted on a part-time basis between October 2018 and January 2019. Seventy-eight RAD decisions were assessed, using the Board’s 24-item checklistFootnote 5. Just over 10% of those decisions were also assessed for compliance with the SOGIE Guideline.

The 2018 version of the checklist adds two questions to the 2016 version, but otherwise contains only minor variations from that instrumentFootnote 6. The application of the checklist to individual decisions was shaped by the interpretative guide to the checklist that was developed for the first RAD decision quality review in 2016, again with only modest changesFootnote 7.

The decisions to be assessed were drawn from all three Regional Offices. The number of files from each Office reflected that Office’s proportion of the appellate caseload for the period under review. The national distribution between proceedings in French and English was also respected in establishing the pool of decisions to be assessed. Unlike in 2016, there was no assessment of RAD oral appeals, as the pool contained too few oral appeal files to permit meaningful generalizations (2 of 78).

The methodology for the assessments was the one developed in 2016, when a number of different approaches to reviewing the files were tested on a quality/cost continuum. Based on that experience, the process employed for 2018 was to first read and make notes on the RAD decision, followed by a review and noting-up of the RPD decision and the Appellant’s memorandum. In the few cases where there was an intervention by the Minister, his memorandum and any response were also reviewed. It was only rarely necessary to consult other documents on the file.

The RAD decision was then re-read, often more than once, as the checklist was completed. As a final step each completed checklist was reviewed as a whole to enhance the accuracy and consistency of the application of the various, and sometimes overlapping, evaluation criteria.

It is important to note that the assessment criteria do not raise any issues about the correctness or reasonableness of a decision. They address only matters of the completeness and clarity of the reasons. The result reached is – quite properly – outside the scope of the assessment.

As the checklists were completed, they were posted to a shared drive to permit staff of the Planning, Evaluation and Performance Measurement Directorate to offer comments. As noted above, the Directorate ultimately uses the final checklists as the basis for their own statistical and other analyses in the quality performance report they prepare for the Board.

The [sampled] decisions provided for review did not reflect an even distribution among RAD Members, particularly in the Central Region. The Directorate has statistical and other analytical methods to compensate for this somewhat skewed distribution in their aggregated report to the Board. Since the evaluator lacks those particular skills, any generalizations found in this Report should be treated with caution.

1.2 The importance of the detailed assessment criteria

As noted in 2016, the evaluation checklist was of enormous value and significance in the assessment of decisions. It provided structure for, and tightly disciplined, the assessment of each decision. By requiring a focus on specific indicators of decision quality, rather than on just an overall impression of a decision, the checklist inexorably drives a reasoned – and transparently justified – basis for each individual assessment. It thereby adds both quality and credence to the review process as a whole.

In my experience it was very often the case that the initial assessment of a decision changed when it was considered against the criteria. More granularly, use of the criteria demonstrated that almost every decision had both good and challenging aspects. A decision might be written in a run-on and somewhat disorganized fashion, but also show a clear and well-supported determination of specific issues. In the same vein, though more rarely, a well-structured decision might fail to explain or support a key finding.

Given that the purpose of the exercise is to support the Directorate’s aggregated determination of the particular areas where the Board as a whole can take confidence in its work, or should seek to improve, the checklist approach is both essential and effectiveFootnote 8.

1.3 The importance of written comments

While the checklist is built around numerical ratings for the assessment criteria, it also permits the inclusion of written comments for each rating. The comments, for which there are no character limits in the form, can be used to explain a rating. They can also be used to justify a rating that might seem anomalous or inconsistent with ratings on similar questions in the checklist. As well, they permit the evaluator to flag issues of interest and/or matters for future consideration.

Adding comments improves the checklist assessments in at least two ways. First, they discipline the evaluator. In my experience the preparation of a comment quite frequently changed the initial numeric rating or led to a reconsideration of other, related ratings.

Additional discipline came from my practice of approaching each decision as if I were in a dialogue with its author in which I needed to be able to justify my assessment. Simply providing a numeric rating alone, especially on a relatively un-nuanced 2 or 3 point scale, is not conducive to internalizing the assessment as a dialogue: explaining in writing a particular rating is.

Second, written comments provide useful information for Directorate staff as they convert the completed evaluation forms into an aggregate-based analysis. The comments allow them to question or qualify ratings, and to see if the approach was consistent across all 78 decisions. As well, it is apparent from the Directorate’s own reports that they find it useful to elaborate on their aggregated findings by citing comments to help the Board appreciate the conclusions they are putting forward.

1.4 The value of the interpretative guide

Part way through the 2016 review it became apparent that some of the criteria were challenging to interpret and apply, especially in relation to other similar criteria (see discussion below). As a result, an interpretative guide was developed to help address those issues.

The guide proved to be extremely useful in applying a consistent and reasonably distinct approach to each question. It also appreciably sped up the assessments by reducing the need to repeatedly puzzle over distinctions among similar criteria, or the relationships among more diverse ones.

For the 2018 assessments, the Board approved the use of the 2016 guide, enhanced with the use of some of the suggestions about the checklist itself that were made in the 2016 Report. The resulting instrument is set out in Appendix I.

2.0 Comments on the Assessment Instrument

Effective as the current approach to assessments is, the existing checklist has aspects that could be improved. My 2016 Report contained a relatively detailed analysis of the checklist, combining comments on issues that arose in the application of various questions with specific suggestions for changes. My 2018 review found many of the same issues with the checklist and, for me, confirmed the ongoing relevance of the 2016 comments and suggestions.

2.1 Challenges with the existing checklist

The numeric ratings

The rating scales were sometimes problematic. The most common scale, consisting of 3 points, made for stark choices that often precluded nuance and required fairly different decisions to be rated the same. Having options in addition to (colloquially) failing, acceptable and exemplary would seem to support more meaningful conclusions when the data are aggregated.

While they were less common, some questions only provided for a binary response. They posed special challenges in some instances, as noted in my 2016 Report.

The allowance for written comments provides a way to address some of these concerns, but for the most part the comments are seen only by Directorate staff. Some are included in the Directorate’s report to highlight certain matters, but they do not affect the average numerical rating that is determined for each question. To the extent that it is the rating and not its context that drives responses to the Directorate’s analysis, there remains the potential for the assessment to be less helpful than it otherwise might beFootnote 9.

Challenges in applying certain checklist criteria

As will be discussed in more detail below, certain of the checklist criteria proved challenging to apply, especially in relation to each other. For example, it was not always easy to distinguish between a criterion of “appropriately clear reasons” and one that looks at whether the reasons are “easily understood”, or between a criterion of “appropriately concise” reasons and a criterion of reasons being “as short and economical as possible”. And, once the supplementary questions were added, there was some additional overlap, such as that between considering whether a decision was logically sequenced and whether it demonstrated a clear logic path to the result.

Additionally, a number of the criteria look to how evidence and questions of fact are addressed by the RAD, but none raise the issue of how well the RAD performs in assessing whether the RPD erred on matters of fact or law. This reduces the value of that aspect of the checklist in assessing the work of an appellate body.

2.2 Examples of ways in which the checklist could be improved

Questions that could be reorganized and combined

As set out in detail in my 2016 comments, a number of the checklist criteria appear to overlap or at least to create some uncertainty around the exact issue they address or how that issue relates to others in the checklist. Even with the help of the interpretative guide, and the experience of a previous assessment process, the 2018 review often required some juggling among the criteria and/or the preparation of comments to explain ratings that could seem incongruous or contradictory.

As suggested in 2016, consideration might be given to reorganizing and re-ordering five of the main checklist questions (in the 2018 checklist, questions 6, 11, 12, 13, 14), plus the new question 5 on the determinative issue, along with supplementary questions S3 and S4. Subject to the views of those with evaluation methodology expertise, it seems that uncertainty and overlap could be reduced, and the essence of these eight questions captured, with the six questions below:

  • Focus on determinative issue: does the decision identify and only address the determinative issue(s)?
  • Clear writing: does the decision use a reasonably simple and straightforward vocabulary; show an attempt to explain concepts and terms; and avoid long sentences, subordinate clauses and long paragraphs?
  • Clear reasoning: does the reasoning on each separate issue clearly and concisely support the conclusion reached on that issue?
  • Clear rationale: does the analysis as a whole clearly explain and support the result in the appeal?
  • Clear logic path: do the reasons demonstrate a clear and logically-sequenced path from the issues examined to the result overall? Is the path articulated for the reader?
  • Brevity: is the decision as concise as reasonably possible given the issues and the positions that needed to be addressed?

Questions that might be reworded to better reflect the role of the RAD

As I understand the history of the RAD checklist, it was adapted from the checklist used to assess the RPD’s first instance decisions. For the most part this works well, and raises no major issues.

However, the focus in questions 6-8 on fact finding and the assessment of evidence does raise issues, given that the quality of an appellate decision is as much about the author’s consideration of the reasoning and conclusions below as it is about fact-finding and the assessment of the evidence. As a result, in a significant number of cases the checklist’s focus on the RAD’s factual findings does not permit a complete assessment of the quality of a RAD decision.

This can to some extent be addressed by nuancing the questions via the interpretative guide and by the use of comments. Nonetheless some dissonance inevitably remains between the numerical ratings and the actual quality of the appellate decision. If it is felt that there would be value in having an assessment of the RAD’s reviews of RPD reasoning and conclusions, consideration might be given to revising questions 6, 7, and 8 roughly as follows:

  • Question 6 might be expanded to refer as well to the RAD making clear findings of errors.
  • Question 7 could go on to ask whether the RAD’s findings of errors are supported by clear analyses and justifications.
  • Question 8 could also ask whether arguments or authorities contrary to a finding of error have been addressed.

Other more technical matters

In the course of the Directorate’s review of the completed checklists, two technical issues arose that might suggest changes in either the checklist or in Board practice.

The first was the challenge posed by the binary choice between finding that a RAD decision confirming an RPD decision did so for the same or for different reasons. Sometimes this is indeed clear, but on other occasions the RAD will have found errors in the RPD’s approach to a ground, but ultimately have found that the RPD determination on that ground was correct.

More commonly, the RAD may overturn a number of RPD credibility findings, but still confirm the overall credibility finding and dismiss the appeal on that basis.

It would seem to be useful to distinguish these kinds of decisions in order to assess the intensity and thoroughness of the RAD’s reviews of RPD decisions. If the Board sees value in this, then a third assessment option could be added. This would create a range like the following:

  • Confirmed for the same reasons.
  • Confirmed for modified reasons.
  • Confirmed on a different basis.

The second matter has to do with tracking how often the RAD provides directions to the RPD when a matter is returned to it. This again can be a way of assessing how thorough an approach is being taken to using the statutory powers given to the RAD.

For the purposes of the assessments undertaken in this review, as set out in the interpretative guide, a decision is only rated as providing directions if such are given on a matter of substance that could shape and/or expedite the rehearing. A direction that the matter be heard by a different panel is accordingly treated as there being no direction.

It appears that Registry staff take a different approach such that a direction that the matter be heard by a different panel is treated the same as a substantive direction.

For reasons of clarity and comparability, the Board may wish to direct that either those assessing decisions, or Registry staff, change their approach.

3.0 General observations on RAD decision writing

3.1 Context

As noted above, it is the extrapolations from the Directorate’s statistical and qualitative analysis of the checklists that the Board will rely on in determining whether to pursue initiatives to improve decision quality. Nonetheless, I was asked to provide observations based upon a sustained engagement with the decisions, and have set out some thoughts below.

Those thoughts should be read with the key caveat noted earlier: the distribution of decisions in the pool renders generalizations problematic without the deployment of the kinds of statistical and analytical expertise enjoyed by the Directorate. Related to this is the reality that these observations are being made in advance of the Directorate’s analysis, and so stand to be corrected by its conclusions on the aggregated ratings and the extent of deviations from the expected norm.

The observations should also be read in light of the author’s recognition that, as noted in 2016, time pressures on RAD members are intense. There is only limited time for rewriting or clarifying a decision. Comments that appear critical should be read in that context, although ideally changes in decision-writing practices that follow from these or other comments will, over time, expedite rather than complicate decision writing.

It is also recognized that in some cases the RAD Member is given very little assistance by the RPD decision and, in many more cases, very little assistance by the Appellant’s arguments. This too needs to be taken into account in reading comments on RAD decisions.

3.2 Examples of improvements since 2016

At an impressionistic level it would seem that quality has improved since 2016. In particular, it seemed that issue-focused decisions constituted a higher proportion of the pool; decisions were shorter; and legal dissertations were fewer.

Issues focus

It seems that decisions are increasingly being written around the specific legal and factual issues that needed to be resolved to determine the appeal, rather than around the events that led to the claim or general discourses on the proceedings below.

This matters because, as the decisions considered this year showed, decisions that are written around the issues are inherently more economical and clear. They are easier for the lay reader to follow because of their relative brevity and because they have a greater tendency to introduce, discuss and resolve discrete issues in self-contained analyses.

While not universally the case, Members who use the issue-driven approach also generally avoid starting the decision with distinct, and often sequentially inconsistent, summaries of the facts, the RPD decision and the Appellant’s arguments.

Instead, having first identified the key issues, these Members more often integrate specific parts of the background information into the issue analysis to which they are most relevant. In this way, everything the reader needs to understand the analysis is in one place: there is no need to flip back and forth between different parts of a decision to understand what is going on.

Decisions that do not take this approach, and there are still more than a few, end up slowing the reader down and creating doubt about what will be in issue. They do this in one or more of three ways. They confuse the reader because much of the material in the overviews ends up not being used in, or being relevant to, the decision itself; they add the barrier of length by repeating the relevant upfront information in the analysis; or they impose the ‘flip back and forth’ burden on the reader.

Tighter decisions

Again at an impressionistic level, there seemed to be fewer discursive decisions than in 2016. Instead, more decisions in this pool reflected a tighter writing style and fewer went on at questionable length.

There were nonetheless some long decisions in the 2018 pool. Some seemed carelessly so, while a few reflected thoroughness and a desire to explain rather than a lack of focus. Those in the latter group could still have been shorter, but it is difficult to fault, from a decision-writing perspective (as opposed perhaps to a Board efficiency perspective), someone who, in a clear and well-structured decision, writes at length to carefully support and explain their findings.

Fewer legal dissertations

One reason behind the trend to shorter decisions was the fact that lengthy legal dissertations were rare. Overall there was, in this pool of decisions, a much more focused use of caselaw. Where caselaw was used, it was more common than in 2016 to state or briefly paraphrase the point of the authority rather than to paste-in a lengthy quote. It also seemed more common to simply cite a buttressing authority in a footnote.

Some Members have honed their writing to the point where, instead of starting an analysis with a full description of a legal standard, they raise each aspect of the standard only in relation to the evidence or sub-issue to which it pertains. This makes the application of the law to the particular case much clearer and easier to follow than when the full standard is set out at the start.

Several Members have developed quite elegant two- or three-line synopses of the law on matters like new evidence, oral hearings, the role of the RAD, credibility assessments and IFA. These convey the law accurately, and in a way that most readers can follow.

Frustratingly, others had similar synopses, but put them at the end of a multi-paragraph discourse on the law. Yet others, though a minority, still rely on a full discussion of the law, complete with quotes from authorities, even where the issue is a routine and long-settled one.

3.3 Some ongoing concerns

In this section, brief mention will be made of some additional areas where problematic decision-writing practices, most but not all quite isolated, have persisted since the decisions reviewed in 2016.

Struggles with plain writing

Plain writing continues to be a struggle for quite a few Members. The issue here is not striving for a true plain language standard, but rather removing some of the more common barriers to easy comprehension.

These include sentences that go on for four or five or more lines, often using complex subordinate clauses, and paragraphs that go on for half a page or so, often containing unrelated mattersFootnote 10. The concept that ‘paragraphs separate ideas’ seems to be somewhat foreign to some Members.

Other matters that detracted from the readability of RAD decisions included the use of technical language; not explaining concepts or standards; using extensive quotes from authorities and documents; and including generic text not needed for or relevant to the decision.

Lack of effective summaries

Another problematic area that is far from isolated is the failure to provide a summary of the decision. While perhaps not the most important area in which the evaluation criteria were not met, it was by far the most pervasive.

Effective summaries – those that told the reader where the decision was going and how it was going to get there – were rare to non-existent. This is the case despite the fact that Board standards have called for effective summaries for a long time, and [ ]Footnote 11.

The almost universal lack of summaries meant that there were multiple decisions in this review where the analysis was read with no idea of why certain matters were being discussed, or how a particular part of the analysis related to the outcome, or where the analysis as a whole was heading. All too often the basis for the result, and the reasoning behind it, came as a complete surprise. Telling the reader in advance what they can expect to find is a fundamental element of making decisions understandable.

Those who teach decision writing often suggest that a summary should be prepared once the outcome has been determined but before the decision is written. This is said to focus the author’s thinking and discipline the writing process, saving time as well as promoting clarity.

But even where a Member finds that approach unproductive, it should at least be possible to take a few minutes after the decision is written to briefly synopsize it. The result will always assist the reader and may, in some cases, indicate to the decision-maker that they have not been as clear in the decision as they wanted to be.

Inclusion of unhelpful text

The 2018 pool contained a few examples of the issue identified in 2016 where the Member seems to be demonstrating and describing their deliberative process rather than explaining and supporting the result they have reached. Writing towards a decision that will end up being made, rather than writing from, and to explain, a decision that has been made, adds length and creates barriers to easy comprehension of the basis for the result.

Related to this is the occasional practice of leaving, in a decision, text that was written as the Member worked his or her way to the result, but which then became irrelevant given the basis on which the result was arrived at. For example, the Member may have sketched out background material to help them understand the facts, the law, the RPD decision and/or the Appellant’s concerns, but then not reconsidered whether parts of those sketches were extraneous to the actual decision. Similarly, there might be quite a lengthy review of credibility or other factual findings or of issue-specific legal questions, only to have the decision turn on completely different matters.

Problematic use of pre-existing text

While it is often an efficient use of time to paste in text from other decisions, there were decisions in which this was done without regard to whether parts of that material addressed factors that had no relevance to the decision being written.

Examples include a full analysis (one page) of the Raza tests when the proffered evidence was rejected on ss. 110(4) grounds alone; discussion of the test for an oral appeal when no new evidence had been admitted; or discussion of the test for deference when the need to consider whether to defer never arose in the decision. This not only creates a needless barrier for the reader, but can also suggest a generic decision rather than one focused on the particular case.

Excessive use of quotations

While less prevalent than before, lengthy quotes from authorities – whether legal decisions, statutes or summaries of the laws or policies of other jurisdictions – continue to feature in a number of decisions. The use of extensive quotes from authorities or official documents creates a serious barrier for the lay reader of the decision: they can be highly legalistic and can deal with complicated issues that are not central to the point for which they are being cited. As well, they tend to be declaratory rather than explanatory.

Lack of structure

It was not uncommon to find several issues discussed at length under a one-word, generic heading such as ‘Analysis’ or ‘Credibility’. This provided little-to-no guidance as to where the reader was in the analysis, and none as to where it was going. Sub-headings were relatively rare.

There were some examples of the use of descriptive headings and/or sub-headings, but only one decision followed the Board’s writing guide by using conclusion-focused headingsFootnote 12. Overall, the adage that “paragraphs separate ideas: headings distill them” seems to be given little credence by RAD Members.

Finally, RAD decisions did not demonstrate much attention to other ways of showing the reader how the issues being dealt with fit with and connect to each other and how they lead to and support the result. To cite just one example, there was almost no use of transition sentences to guide the reader from the conclusion on one issue to the start of the next issue.

Lack of active engagement with applicable Guidelines

With the exception of the new SOGIE Guideline, it was rare to find a decision that actively engaged with and explored the potential value of a Guideline, even one whose relevance to the decision was expressly noted. While there was less use of the unsupported statement that “I have fully considered [the applicable Guideline]”, there was also not much express indication that serious, analytical and purposive attention was being paid to a relevant Guideline.

This could be because, for the long-standing Guidelines, most Members have, or feel they have, internalized them to the extent that their application is more or less automatic. But even where that is the case, a cursory mention does little to show the reader how and to what extent the Guideline affected the decision, or to explain why it was found to have no impact on the result.

And, of course, there is also the possibility that over time an assumption about what a Guideline says might replace, in the mind of an adjudicator, what it actually says.

This may suggest that for certain key Guidelines the Board could usefully undertake a specific sub-study of how they are being applied in practice. The sub-study of the SOGIE Guideline, if it is found to have produced useful information, could be a model for such a review.

Limited use of jurisprudential guides

There was limited use of the jurisprudential guides. Only one decision seemed to take full advantage of a guide by directly relying upon it within the decision, including by noting that it had properly addressed the authorities cited by the Appellant in the case at hand. This produced a more concise decision and also helped promote – and demonstrate – consistency.

Overall this would seem to be an area where both the quality and economy of decisions could be improved.

Decision-writing [ ] not used

[ ] that would seem to constitute an excellent, practical, and subject-matter-specific resource for Members. It is not clear to what extent, or with what regularity, Members are given refreshers [ ], but the Board could usefully consider whether this should be a relatively frequent occurrence over the career of an adjudicator.

4.0 A concluding comment on aggregated ratings

As noted earlier, the Board relies on aggregated indicators to assess the quality of RAD decisions. This approach has its own justifications. Conclusions drawn from the aggregations are clearly useful for identifying aspects of decision writing that the RAD as a whole should consider as targets for improvement. And, reliance on aggregates protects the anonymity of the decision-writers.

However, analyses drawn from aggregated data do not address the reality that ‘averages hide a multitude of sins’. Even if most Members are meeting or exceeding a given standard, such that the rating indicates all is well, the fact may well remain that certain Members are consistently not meeting the standard. If their decisions are flawed, the subjects of those decisions will draw little solace from the fact that most Members are doing much better.

This raises the issue of whether the Board should consider a more individualized assessment of Members’ decision writing skills.
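
To make the concern concrete, here is a minimal sketch (in Python, using entirely hypothetical Members and scores rather than data from this review) of how an aggregate mean on a 1-to-3 indicator can suggest all is well while one Member consistently falls below the ‘meets expectations’ score of 2:

```python
from statistics import mean

# Hypothetical 1-to-3 ratings on a single indicator, grouped by anonymized Member.
# All names and scores are invented for illustration only.
ratings_by_member = {
    "Member A": [3, 3, 2, 3],
    "Member B": [2, 3, 3, 2],
    "Member C": [1, 1, 2, 1],  # consistently below the standard
}

all_ratings = [r for scores in ratings_by_member.values() for r in scores]
print(f"Aggregate mean: {mean(all_ratings):.2f}")  # 2.17 -> suggests all is well

for member, scores in ratings_by_member.items():
    met = sum(s >= 2 for s in scores) / len(scores)
    print(f"{member}: mean {mean(scores):.2f}, meets the standard in {met:.0%} of appeals")
```

On these invented figures the aggregate sits comfortably above 2 even though Member C meets the standard in only a quarter of their appeals, which is precisely the pattern an individualized assessment would surface.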

An assessment of decisions concerning Chairperson's Guideline 9 and of the evaluation instrument developed for that purpose

1.0 Introduction

Chairperson’s Guideline 9 on proceedings involving sexual orientation and gender identity and expression [SOGIE] came into force on May 1, 2017. It was thus in effect for all of the RAD decisions assessed in this review, although it was not in force for all of the RPD decisions that the RAD considered.

The Board has developed a separate evaluation instrument (checklist) for cases in which the SOGIE Guideline was, or should have been, considered by the RAD. In addition to assessing RAD decisions against the criteria in that instrument, I was also asked to assess the instrument itself for its utility in determining whether the Guideline is being given proper effect.

Assessed against the standards in Guideline 9, most of the RAD decisions in this sample that involved the SOGIE Guideline were in overall compliance with it, albeit with some significant exceptions. However, I found the evaluation instrument itself to be of limited assistance in reaching that determination.

This Part of the Report will start with some brief observations on the application of the Guideline by RAD Members. This is intended as only an impressionistic supplement to the core of the assessment, which is found in the completed checklists submitted separately to the Directorate. As with the main body of the work, the checklists will be the starting point for the Directorate’s formal evaluation of the application of the Guideline.

This Part of the Report will then move to an analysis of the checklist itself and ways in which it might be made a more effective instrument for assessing RAD decisions. It will conclude with a brief comment on the potential application of a revised checklist to matters before the RPD.

2.0 Preliminary observations on RAD decisions that applied the Guideline

2.1 Approaches the Board may wish to reinforce

More than one decision showed a serious and contextualized engagement with the Guideline. This included articulating the need for sensitivity in applying the Guideline and going on to make it apparent that the Member knew its content and why it had been created.

These and several other decisions took care to cite particular aspects of the Guideline in dealing with the specific issues to which they related, rather than providing an introductory overview and then making findings as issues arose. This integration of the text of the Guideline into the discussion of specific issues or concerns helps ensure an accurate consideration of the Guideline. It also provides the reader with an easy way to grasp how the Guideline has affected the RAD’s reasoning and conclusions.

Several decisions demonstrated a serious appreciation of the social, mental health, family and cultural impediments to both acting on a realized sexual orientation and to giving compelling testimony on the issue. (See Guideline 3.6)

Decisions also demonstrated an appreciation of what is behind Guideline 3.2 (the Appellant may not have other evidence of their orientation, identity or expression) by noting that this means that more reliance should be placed on the Appellant’s testimony even where there are credibility problems in the supporting documents.

As well, decisions applied the principle found in the applicable Decision of Interest (MB5-03341) that adverse credibility findings on matters not related to sexual orientation should not reflexively be used to undercut the claimant’s credibility on the issue of their own orientation.

Other positive aspects of decisions that considered the Guideline included:

  • Effective critiques of the use of stereotypes at the RPD, including in a case where the Guideline had not been applied because the claimant’s sexual orientation was not disputed. In that case the RAD criticized the RPD for making a stereotypical assumption that the claimant’s girlfriend would necessarily know about other aspects of his sexual identity.
  • Clear explanations of the Guideline and its application.
  • Application of both Guidelines 8.5.11 and 8.5.12 when a sur place claim was in issue.
  • Engagement with the difficulty of obtaining corroborating evidence.
  • Giving effect to Guideline 3.1 re gradual awareness and Guideline 3.3 re the impact of home country suppression on testimony.
  • Giving weight to local support letters as contemplated by Guideline 7.2.3.

2.2 Approaches that give rise to concerns

Strong as many of the decisions that dealt with the Guideline were, one or more decisions also seemed inconsistent with it, for example by:

  • Failing to cite or demonstrate the application and appreciation of seemingly relevant parts of the Guideline.
  • Citing the Guideline at the start of a decision and then not demonstrating its use.
  • Applying a stereotypical assumption that a same-sex relationship would be “an authentic romantic one”.
  • Requiring the claimant to produce evidence of cultural or other barriers to address concerns about their testimony being vague.
  • Not considering the trauma of being discovered in a highly dangerous situation when discounting testimony about the incident for lack of detail (e.g., the number of attackers).
  • Expecting that same-sex partners who met for occasional encounters over a period of a few months at a hotel where the claimant worked would have contact information for each other.
  • Routinely discounting evidence of participation in Canadian LGBTQI+ organizations without adverting to the Guideline concerning its potential relevance (7.2.3) or to the Guidelines on how it can be challenging to get corroborating evidence of sexual identity (3.2, 7.2.1 and 7.2.2).

3.0 Assessment of the Checklist

Challenges in applying the checklist

Overall, the checklist was not terribly helpful in assessing the application of the Guideline. Most of the questions in the checklist involved issues that did not arise in the decisions reviewed, and that may only infrequently arise in RAD matters. Indeed, ‘n/a’ was by a very large margin the most common response to the questions in the instrument.

Based on the decisions reviewed for this exercise, and a general sense of matters likely to arise at the RAD, only three of the 13 issues listed in the checklist would seem to be relevant to a significant number of RAD decisions, with two others having an appreciable but more limited potential application. As a whole, the checklist contained too much detail on matters not likely to arise and too little detail on matters that will frequently arise.

As well, the permitted ratings in the instrument were confined to Yes, No or N/A, while the few relevant questions were expressed in very broad terms. This suggests that aggregated ratings of the Yes and No responses to those questions are likely to have only very limited value in identifying areas where reinforcement should be offered or improvements should be sought.

This puts the real emphasis on the evaluator’s comments, which ended up having to be much more extensive than the comments made in the main checklist. This is in part because there is no rating scale, but primarily because there are so many more, and more detailed, issues addressed in the Guideline than are reflected in the checklist. Comments were accordingly needed just to identify issues, rather than to make and explain assessments of issues raised in the checklist.

The question for the Board accordingly becomes whether a statistically meaningful analysis can be derived from what are in essence individualized narrative critiques of specific decisions. In the event that the answer turns out to be No, the balance of this section will address ways in which the Board could revise the instrument to support assessments more like those done under the main checklist.
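
A simple tally (in Python, with invented counts rather than figures from this review) illustrates the arithmetic behind this concern: when ‘n/a’ dominates, the rated base is too small for aggregated Yes/No percentages to mean much.

```python
from collections import Counter

# Hypothetical responses to one broad checklist question across a pool of decisions;
# the distribution is invented to mirror the pattern described above.
responses = ["n/a"] * 18 + ["yes"] * 3 + ["no"] * 1

tally = Counter(responses)
rated = tally["yes"] + tally["no"]

print(tally)  # Counter({'n/a': 18, 'yes': 3, 'no': 1})
print(f"Rated decisions: {rated} of {len(responses)}")
# A 'yes' rate of 75% rests on only 4 rated decisions and supports no generalization.
```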

Possible changes to the checklist

In broad terms the changes outlined below involve eliminating the questions that will rarely arise (or perhaps putting them in an ‘as-needed’ appendix) and replacing them with more detailed questions on the matters that are most likely to arise.

While it is the Directorate that has the expertise needed to design meaningful checklists, the outline set out below may serve as a useful starting point for their work if revisions are undertaken. In preparing it I have included a fair amount of particularization. This is in large measure to make it easier to assess the purpose and value of the proposed questions. If elements of what is suggested below end up being of interest to the Board, at least some of the additional detail might best be placed in an interpretative guide rather than in the checklistFootnote 13.

As was noted above, five of the existing checklist questions dealt with matters that should be included in a revised checklist. One of those asks whether sensitive information was protected. It seems suitable as is, and also seems to work fine with the Yes-No-N/A rating scale.

Another asks whether the decision-maker considered the use of laws of general application to target individuals with diverse SOGIE. A third deals with the potential lack of country documentation concerning the treatment of persons with diverse SOGIE. These both also seem fine as drafted, but if a 3-point rating scale is adopted for other items in the checklist (see below) that approach would seem appropriate for these questions as well.

The other two relevant questions address the use of stereotypes and an assessment of whether an appreciation of cultural, psychological or other barriers was shown in considering apparent inconsistencies, vagueness or material omissions. These two questions, while getting at the core of the Guideline, could usefully be expressed in significantly greater detail as well as contextualized in accordance with the Introduction to the GuidelineFootnote 14 and supplemented by certain additional questions.

My suggestions for new checklist questions are set out below. In proposing them I have assumed that there would be at least a 3-point rating scale in order to support more meaningful generalizations from the aggregated scores. I have also assumed that the assessment instrument would continue to allow for comments to elucidate or expand on the numeric ratings.

    Purposive and demonstrated application
  • 1.2 Did the RAD show a serious, sensitive and contextualized engagement with the Guideline? Did the RAD’s reasons give effect to both its spirit and its text and show an appreciation of the concerns that gave rise to the Guideline and of the specific challenges of both presenting and adjudicating these cases?
  • 2.2 Where relevant or potentially relevant sections of the Guideline were identified in a RAD decision, was their application to the decision, or the lack of application, explained?
    Assessing the manner in which the claimant testified
  • 3.2 In assessing the manner in which the claimant testified, did the RAD identify all relevant aspects of the Guideline and show a real appreciation of their potential application to the case? [For example, Guidelines 3.3: reticence from prior need for concealment; 3.6: psychological or mental health conditions; and 7.6: cultural, psychological or other explanations for the manner of testifying. See also Decision of Interest MB5-03341, para. 27].
    Making findings on the claimant's credibility
  • 4.2 In assessing the credibility of the claimant, did the RAD:
    • advert to possible cultural, psychological or other explanations for: inconsistencies (7.4.1); vagueness (7.6); or material omissions (7.7)?
    • avoid stereotypes or inappropriate assumptions in assessing inconsistencies (7.4.1) or implausibility (7.5.1Footnote 15)?
    • advert to the possibility that delay in claiming may be attributed to the claimant’s fear of reprisals, reticence to tell family members what the basis of the claim will be, or reluctance to accept their own orientation or identity (8.5.11)?
    • avoid reflexively using credibility findings on other issues to discount the claimant’s evidence of their sexual orientation, gender identity or gender expression (per Decision of Interest MB5-03341, at para 31, and the authorities cited therein)?
    Making other findings
  • In making findings of fact about the claimant’s sexual orientation, gender identity or gender expression, did the RAD consider and direct itself on the relevant aspects of the Guideline? [As appropriate: gradual awareness (2.5 and 3.1); claimant’s evidence as the only possible evidenceFootnote 16 (3.2 and 7.1); specific reasons why corroboration may not be possibleFootnote 17 (7.2.1 and 7.2.2); the potential relevance of participation or non-participation in LGBTQI+ organizations (6 (last bullet) and 7.2.3Footnote 18); psychological and cultural barriers (7.4.1); stereotypes and inappropriate assumptions (7.4.1 and see 6 generally)].
  • In making findings on other matters in issue, did the RAD consider and direct itself on all relevant aspects of the Guideline including, but not limited to,Footnote 19 a demonstrated appreciation of the potential for stereotypes such as those set out in Guideline 6 to affect those findings?
    Determining state protection and IFA
  • In determining issues of state protection did the RAD advert to, and as appropriate give effect to, Guidelines 3.5 and 8.6?
  • In determining issues around the existence of an IFA did the RAD advert to, and as appropriate give effect to, Guideline 8.7?

4.0 Conclusion

The Board’s decision to formally assess the application of the SOGIE Guideline so soon after its adoption is a clear demonstration of the importance the Board places on having these important and sensitive matters handled appropriately by the RAD. The above suggestions for improvements to the evaluation instrument are intended to support that laudable objective.

Overall, the proposed new questions were designed to make it easier to draw meaningful aggregated conclusions about the application of the Guideline. They essentially particularize, and permit numerical ratings on, the kinds of matters that were addressed in the written comments made in this initial assessment. Their adoption would facilitate making findings and drawing conclusions in a future assessment, but would not dramatically alter the information that was obtained in this initial assessment.

5.0 Potential Application of Checklist Revisions to RPD Proceedings

The Board is about to conduct an assessment of the application of the SOGIE Guideline by the RPD. I was accordingly asked to comment on whether the changes outlined above might also apply to a checklist for assessments of RPD decisions.

My sense is that all of the above questions are appropriate for such assessments, but that the preceding critique of the existing checklist is not apt. This is because the questions that were found not to be relevant in the vast majority of RAD appellate decisions will more often, and sometimes almost always (questions 4 and 7), be relevant to assessments of the RPD.

Nonetheless, if the Board agrees that the above new questions are relevant to the assessment of RPD decisions, it may well be possible to develop a single evaluation instrument that could be used in reviews of both RPD and RAD decisions. Such an instrument could be designed to have three distinct sections, each to be completed only where relevantFootnote 20.

Section 1: All proceedings where oral hearings were held (existing questions 1, 3, 4 and 7).

Section 2: All decisions (existing questions 5, 9 and 12, plus the suggested questions above in lieu of questions 6 and 8).

Section 3: Selected proceedings where relevant (existing questions 2, 10, 11 and 13).

If it commended itself to the Board, this structure would provide a universal assessment instrument that would focus the attention of the assessor on the essential quality considerations in each decision reviewed.
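
For illustration only, a single instrument of this kind could be represented as a simple mapping from sections to the question numbers above, with applicability logic deciding which sections an assessor completes (the labels and the function below are assumptions, not a specification):

```python
# Illustrative sketch of the proposed three-section instrument.
# Question numbers follow the outline above; everything else is hypothetical.
UNIFIED_SECTIONS = {
    "Section 1 (oral hearing held)": [1, 3, 4, 7],
    "Section 2 (all decisions)": [5, 9, 12],  # plus the suggested questions in lieu of 6 and 8
    "Section 3 (selected proceedings)": [2, 10, 11, 13],
}

def sections_to_complete(oral_hearing_held: bool, selected_issues_arise: bool) -> list[str]:
    """Return the checklist sections an assessor would complete for a given decision."""
    sections = ["Section 2 (all decisions)"]  # completed for every decision
    if oral_hearing_held:
        sections.insert(0, "Section 1 (oral hearing held)")
    if selected_issues_arise:
        sections.append("Section 3 (selected proceedings)")
    return sections

# Example: a paper appeal (no oral hearing) raising one of the selected issues.
print(sections_to_complete(oral_hearing_held=False, selected_issues_arise=True))
```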

Concluding comment

It was a fascinating challenge, and also a pleasure, to assess RAD decisions on both the main and the SOGIE criteria, and as well to consider ways in which the existing evaluation instruments could be made more effective. I remain impressed by both the RAD’s important contribution to the vital task of achieving just outcomes in refugee matters, often accomplished under challenging circumstances, and by the Board’s ongoing commitment to assessing and improving decision quality at the RAD.

Appendix I

One party [claimant] appeals: Interpretation of checklist
As of October 29, 2018
Outcome: ‘Returned with instructions’ means substantive directions beyond being heard by a different panel. May differ from disposition record on the file.
Q 1: Assesses how the Member proceeds when raising a new issue from the record or where specialized knowledge is used (see Rule 24) or where a change in country conditions is relied upon. This may not apply in claimant appeals where a decision is substituted or the matter is sent back. [Note that Q [17] addresses whether a Member raises a new issue; i.e. goes to the thoroughness of the review rather than to procedural fairness].
Q 2: A bare statement of the issues is treated as falling short of the standard and rated as 1, with a basic summary as 2 and a particularly helpful summary as 3. An example of a bare statement of the issues would be: “The issues in this appeal are credibility and whether there is an IFA”.
Q 3: Means that the Member addresses the individual’s positions, with the scale assessing whether the thoroughness with which they are addressed matches their relevance to the outcome. The question does not assess whether the findings of the RPD are addressed.
Q 4: New: focus on determinative issue; seek most efficient path through the maze to the issue, e.g. SP
Q 5: Note that credibility overall is treated as a fact for the purposes of this question and Q7. Does the Member make clear findings of fact when assessing the evidentiary record and make clear findings on whether there are or are not specific errors in the RPD decision?
Q 6: Note that the issue is findings of fact, not findings of error.
Q 7: This will apply less frequently where the appeal is allowed.
Q 8: Includes caselaw: answer in the negative when the only caselaw referred to is that concerning the RAD’s jurisdiction.
On guidelines, the issue is whether the standard is referred to and its application explained, not whether it is formally cited.
Jurisprudential Guides
-internal flight in Nigeria
-viable IFA for Punjabi cl in India (revoked 12/12/18)
-ability to exit Chinese airport with valid passport
-Whether N Korean deemed S Korean citizen
Decisions of Interest
-domestic violence through lens of G4
-family unity/best interest of child
-exclusion under Article 1E
-credibility thru lens of G9
-political sit’n in Turkey re Hizmet
-exclusion for crimes against humanity
Q 9: The core issue is whether potential biases concerning socio-cultural background were adverted to and filtered out from the assessment of how the claimant’s or another witness’s evidence was presented at the RPD. Can be expanded to:
-whether, if the record disclosed specific social or cultural issues concerning the events that gave rise to the claim or concerning the Appellant’s experiences or reactions, the Member adverted to those matters and addressed them;
-whether the Member addressed any cultural or social considerations that might have affected how the Appellant testified before the RPD or how their evidence was heard;
-whether the Member applied, or gave the appearance of applying, personal, Western or generational values to the assessment of events, conduct or testimony in the case.
Q 10: This does not involve a formal plain language assessment. A decision will be rated 2 unless it used challenging, technical or obscure language or was written in a particularly complex structure.
-whether a decision is clearly written: does it use a reasonably simple and straightforward vocabulary; show an attempt to explain concepts and terms; and avoid long sentences, subordinate clauses and long paragraphs.
Q 11: To avoid an overlap with Q10, ‘clear’ will address whether the reasoning, as opposed to the language or structure used, is clear: this question assesses whether, despite some difficult language or structure, the reasons explain the conclusion(s). To avoid an overlap with Q14, ‘conciseness’ is treated as referring to the length of the analysis of each issue as opposed to the length of the reasons overall.
Q 12: Given that ‘easily understood’ and ‘logically-sequenced’ could be seen to overlap with ‘clear’ in Q11, the approach will assess whether the reasons are structured to be easily understood: i.e., easily understood because logically structured and ideally sign-posted throughout.
Q 13: Given the potential overlap with Q12 re conciseness, the assessment will look at the length of the reasons overall and whether matters not necessary to support or explain the decision were dealt with.
Q 16: Will also consider the application of the appropriate tests for an oral hearing if new evidence is admitted.
Q 17: The assessment will look at whether the Member finds a new issue in the record. [Compare to Q1]
Q 18: The assessment will look for a clear articulation of the distinct issues in the appeal and for reasons that address all relevant considerations and information under each distinct issue in question: evidence, law, submissions, RPD findings and RAD conclusions.
Q 19: The assessment looks for focused reasons that demonstrate and ideally explain a clear line between each issue and how it was resolved. A decision could meet Q S3 in that it provides an issue-by-issue analysis but not meet this standard.
Q 20: The assessment will consider whether the reasons do more than support the result and instead explain to the Appellant how and why they lost or won. The standard is applied less strictly in cases where the Appellant won their appeal.
Q 21: The assessment will consider whether the reasons provide information that is likely to assist the RPD to determine the matter, if returned, and/or to assist others in resolving similar issues once the decision is posted on CanLII.
Q 22: New:
-deal with a new issue by writing to the Appellant to ask them to address it rather than sending back;
-at oral hearing be creative in expanding on credibility questions to get evidence needed to substitute a decision;
-substitute at 51%: doubt up to 49% is not a reason to send back;
-give express reasons for not substituting
Q 23: The assessment will consider whether this is demonstrated by independent engagement with the evidence; by an independent analysis of credibility; by references to the recording, submissions or additional documents from the record; and by the avoidance of terms like ‘the RPD reasonably found’ or ‘it was open to the RPD to find’. It may also be expressly stated as the approach used.

The variations from the 2016 guide are in italics. Q 5 and Q S7 are new for 2018. The other italicized items are suggestions from the 2016 Report that were approved for use in 2018, plus added specifics on jurisprudential guides and decisions of interest.

Appendix II – Biography of Doug Ewart, 2018 Quality Assurance Evaluator

Doug is a justice policy consultant. He holds an LL.B. from Osgoode Hall Law School and an LL.M. from the London School of Economics. He was the head of the Ontario Attorney General’s Policy Development Division for thirteen years, having previously combined policy work for the Ministry with an extensive criminal law appellate practice before the Court of Appeal for Ontario and the Supreme Court of Canada. He has published three legal texts.

He also spent some thirteen years on an Executive Interchange to the Government of Canada. There, his responsibilities included acting as Senior General Counsel and Senior Advisor to the Deputy Minister of Justice, and then as Senior Advisor to Deputy Ministers at the Privy Council Office and the Department of Indian Residential Schools Resolutions Canada.

In the last-noted role he was credited with being the principal architect of, and played an extensive role in implementing, an adjudication process for individual claims of sexual and physical abuse of aboriginal children who attended residential schools. It has been successfully used to resolve some 40,000 of those sensitive and complex claims.

His administrative justice experience also includes serving as the Executive Lead for the transformation of the Human Rights Tribunal of Ontario into a direct access, active-adjudication body. This involved the design of an entirely new process for receiving and determining human rights claims and then serving as the operational lead for the start-up phase of the new tribunal.

As well, he has served as Special Advisor to the Executive Chair of Ontario’s first cluster of adjudicative tribunals, working to develop and implement this new approach to tribunal efficiency and effectiveness for five environment and lands tribunals. He then went on to participate in developing the policy framework for the clustering of seven social justice tribunals.

His other experience includes acting as the policy lead for the 2008 Review of the Roots of Youth Violence established by the Premier of Ontario, and being the lead drafter of its report.

Notes

Note 1

Twelve members with less than one year of experience were excluded from the study.

Return to note 1 referrer

Note 2

It is acknowledged that the study may be following a narrow interpretation of the jurisprudence. There are a number of cases that say that further findings made by the RAD on an issue that the appellant was aware of or should have been aware of are not new issues and do not require notice (see: Adeoye v. M.C.I., 2018 FC 245; Marin v. M.C.I., 2018 FC 243; Caleb v. Canada (Citizenship and Immigration), 2018 FC 384; Emac Sonkoue v. MCI, 2018 FC 1173; Boluwaji v. MCI, 2018 FC 1154; He v. MCI, 2018 FC 627; Bebri v. MCI, 2018 FC 726; Akram v. MCI, 2018 FC 785; Jiang v. MCI, 2018 FC 1064).

Return to note 2 referrer

Note 3

See page 40 of this report.

Return to note 3 referrer

Note 4

While the preparation of the Report involved a certain amount of legal research and analysis, nothing in the Report is intended as, nor should be read as, a legal opinion or legal advice.

Return to note 4 referrer

Note 5

Two of the checklist items are automatically calculated and do not figure into this Report.

Return to note 5 referrer

Note 6

The new questions are numbers 5 and S7. The variations in wording are found in 2018 questions 3, 4, 14 and S3. Given the original purpose of question S3, and the addition of the new question 5, the Board may wish to remove the new reference to ‘determinative’ issues in question S3 if it does not proceed with the restructuring of certain questions proposed later in this Report.

Return to note 6 referrer

Note 7

The interpretative guide used in the 2018 assessments is set out in Appendix I, with modifications from the 2016 version shown in italics.

Return to note 7 referrer

Note 8

See in this connection the brief comment at page 14 on the potential value of individualized assessments.

Return to note 8 referrer

Note 9

As of January 2019 the Directorate advises that it is looking at options to make greater use of the written comments.

Return to note 9 referrer

Note 10

These fairly basic problems have persisted since 2016 [ ].

Return to note 10 referrer

Note 11

See [ ]

Return to note 11 referrer

Note 12

[ ].

Return to note 12 referrer

Note 13

That said, it may be that conducting one or more assessment exercises with most of the detail in the checklist itself could support a further refinement of the checklist based on the frequency with which particularized issues arose.

Return to note 13 referrer

Note 14

Four of the themes set out in the Introduction seem core to the evaluation of RAD decisions: challenges in presenting evidence; avoiding stereotyping and inappropriate assumptions in fact finding; assessing credibility; and increasing awareness of unique circumstances that may affect findings of fact or of mixed fact and law (see Guideline 1.4, points i, iv, v and vi). While these themes do not translate directly into the structure for an evaluation checklist, they have informed the approach set out in this section of the Report.

Return to note 14 referrer

Note 15

See also Decision of Interest MB5-03341 at para. 29.

Return to note 15 referrer

Note 16

The core issue here is whether the RAD gave real effect to the Guideline’s appreciation that the Appellant’s testimony might be the only evidence before discounting the Appellant’s stand-alone evidence or finding that by itself it was not sufficient to prove their orientation, identity or expression. Applying the Guideline might also mean that increased weight should be given to even a modicum of supporting evidence.

Return to note 16 referrer

Note 17

While this may be an issue for training rather than the checklist, it is worth noting the difference between a finding that testimony is not compelling enough to be credible without corroboration, and a finding that testimony is not credible just because there is no corroboration.

Return to note 17 referrer

Note 18

In particular this would involve avoiding unexplained discounting of corroboration based on active involvement with LGBTQI+ organizations in Canada.

Return to note 18 referrer

Note 19

For example, discounting corroboration on the basis that it would be unreasonable for the individual to subject themselves to risk by supporting the Appellant in their home country or by offering supportive evidence for their refugee claim.

Return to note 19 referrer

Note 20

The use of clickable tabs, as in the existing electronic version of the checklist, would facilitate this.

Return to note 20 referrer