Stream - 6. collect() (Terminal Operations 2/3)

Java 2020. 3. 11. 21:43

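collect()

The most complex, and also the most useful, of the terminal operations.

A terminal operation that collects the elements of a stream; it is similar to reducing.

Its parameter type is Collector: the argument is an object of a class that implements Collector.

- The stream's elements are collected in the way that object implements.

Just as sort() needs a Comparator, collect() needs a Collector.

Object collect(Collector collector) // takes an object of a class that implements Collector

/* Rarely used, but handy for collecting with simple lambda expressions instead of implementing the Collector interface */
Object collect(Supplier supplier, BiConsumer accumulator, BiConsumer combiner)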
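The two overloads above can be compared side by side. The following is a minimal sketch, not part of the original notes; the class name CollectOverloadsDemo and its helper methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class CollectOverloadsDemo {

    // One-argument form: a ready-made Collector describes how elements are gathered.
    static List<String> withCollector(Stream<String> stream) {
        return stream.collect(Collectors.toList());
    }

    // Three-argument form: the supplier creates the container, the accumulator adds
    // one element, and the combiner merges two partial containers (parallel streams).
    static List<String> withLambdas(Stream<String> stream) {
        return stream.collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    public static void main(String[] args) {
        System.out.println(withCollector(Stream.of("a", "b", "c"))); // [a, b, c]
        System.out.println(withLambdas(Stream.of("a", "b", "c")));   // [a, b, c]
    }
}
```

Both calls produce the same list here; the three-argument form simply spells out the steps that a Collector would otherwise bundle.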

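Collector

To collect the elements of a stream, the way they are collected must be defined. That definition is the collector.

A collector is an implementation of the Collector interface; you can implement one yourself or use a ready-made one.

Collectors Class

Has static methods that return a variety of ready-made collectors.

The collectors this class provides are enough for most tasks.

Quick Summary

collect() : terminal operation of a stream; takes a collector as its argument

Collector : interface; a collector must implement it

Collectors : class; provides ready-made collectors through static methods

Converting a Stream to a Collection/Array : toList(), toSet(), toMap(), toCollection(), toArray()

To collect all elements of a stream into a collection, use a Collectors method such as toList().

To use a specific collection other than List or Set, pass that collection's constructor reference to toCollection().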

List<String> names = studentStream.map(Student::getName)
                                    .collect(Collectors.toList());

ArrayList<String> list = names.stream()
                   .collect(Collectors.toCollection(ArrayList::new));
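As a runnable illustration of the three collection collectors mentioned above, here is a small sketch. The class name ToCollectionDemo and the sample names are assumptions made for this example only.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToCollectionDemo {

    // toList(): keeps encounter order and duplicates.
    static List<String> asList(Stream<String> names) {
        return names.collect(Collectors.toList());
    }

    // toSet(): drops duplicates; no ordering guarantee.
    static Set<String> asSet(Stream<String> names) {
        return names.collect(Collectors.toSet());
    }

    // toCollection(): the constructor reference picks the concrete collection type.
    static TreeSet<String> asSortedSet(Stream<String> names) {
        return names.collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        System.out.println(asList(Stream.of("kim", "lee", "kim")));              // [kim, lee, kim]
        System.out.println(asSet(Stream.of("kim", "lee", "kim")).size());        // 2
        System.out.println(asSortedSet(Stream.of("park", "kim", "lee", "kim"))); // [kim, lee, park]
    }
}
```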

For a Map, entries are key-value pairs, so you must specify which part of the object becomes the key and which becomes the value.

/* From a Stream of Person, use the Person's ID as the key and the Person itself as the value */
Map<String, Person> map = personStream
                     .collect(Collectors.toMap(p -> p.getId(), p -> p));
                     
/* Function.identity() can be used in place of the identity lambda p -> p */
Map<String, Person> map = personStream
                     .collect(Collectors.toMap(p -> p.getId(), Function.identity()));
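One detail worth knowing when using toMap(): the two-argument form throws IllegalStateException if two elements produce the same key, while the three-argument overload takes a merge function to combine the values. The sketch below assumes hypothetical helper names (ToMapDemo, lengthByWord, countByFirstLetter) and uses plain strings instead of Person.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ToMapDemo {

    // Key = the word itself, value = its length. Duplicate words are rejected.
    static Map<String, Integer> lengthByWord(Stream<String> words) {
        return words.collect(Collectors.toMap(Function.identity(), String::length));
    }

    // With a merge function, values that share a key are combined instead.
    static Map<String, Integer> countByFirstLetter(Stream<String> words) {
        return words.collect(
                Collectors.toMap(w -> w.substring(0, 1), w -> 1, Integer::sum));
    }

    // Returns true when the two-argument toMap() refuses a duplicate key.
    static boolean rejectsDuplicateKeys() {
        try {
            lengthByWord(Stream.of("kim", "kim"));
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthByWord(Stream.of("kim", "park")).get("park"));                   // 4
        System.out.println(countByFirstLetter(Stream.of("apple", "avocado", "banana")).get("a")); // 2
        System.out.println(rejectsDuplicateKeys());                                               // true
    }
}
```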

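To convert the elements to an array of type T[], use toArray() - but pass an array constructor reference of that type (e.g. Student[]::new) as the argument.

- With no argument, it returns Object[].

Student[] studentNames = studentStream.toArray(Student[]::new); // OK
Student[] studentNames = studentStream.toArray(); // ERROR: Object[] cannot be assigned to Student[]
Object[]  studentNames = studentStream.toArray(); // OK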
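The difference between the two toArray() forms can be checked with a small sketch. ToArrayDemo and its method names are hypothetical; String is used in place of Student to keep the example self-contained.

```java
import java.util.stream.Stream;

class ToArrayDemo {

    // The array constructor reference tells the stream which array type to create.
    static String[] typed(Stream<String> stream) {
        return stream.toArray(String[]::new);
    }

    // Without an argument the runtime type is always Object[].
    static Object[] untyped(Stream<String> stream) {
        return stream.toArray();
    }

    public static void main(String[] args) {
        System.out.println(typed(Stream.of("a", "b")).length);                     // 2
        System.out.println(untyped(Stream.of("a", "b")).getClass().getSimpleName()); // Object[]
    }
}
```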

Statistics : counting(), summingInt(), summingLong(), summingDouble(), averagingInt(), averagingLong(), averagingDouble(), maxBy(), minBy()

The statistics offered by the terminal operations can also be obtained through collect().

Why they exist even though the results are the same - they are needed when combined with groupingBy().

Take care not to confuse summingInt() with summarizingInt().

// number of elements
long count = studentStream.count();
long count = studentStream.collect(Collectors.counting());

// sum
long totalScore = studentStream.mapToInt(Student::getTotalScore).sum();
long totalScore = studentStream.collect(Collectors.summingInt(Student::getTotalScore));

// maximum
OptionalInt topScore = studentStream.mapToInt(Student::getTotalScore)
              .max();
Optional<Student> topStudent = studentStream
              .max(Comparator.comparingInt(Student::getTotalScore));
Optional<Student> topStudent = studentStream
              .collect(Collectors.maxBy(Comparator.comparingInt(Student::getTotalScore)));

// Building an IntSummaryStatistics - useful when you need several values at once, e.g. both getSum() and getAverage()
IntSummaryStatistics stat = studentStream
              .mapToInt(Student::getTotalScore).summaryStatistics();
IntSummaryStatistics stat = studentStream
              .collect(Collectors.summarizingInt(Student::getTotalScore));
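The snippets above show alternatives for the same result; since a stream can be consumed only once, a runnable version has to create a fresh stream for each call. The following sketch assumes a hypothetical score list (SCORES) and class name (StatsCollectorsDemo), and also shows how summingInt() and summarizingInt() differ.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.IntSummaryStatistics;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class StatsCollectorsDemo {

    // Hypothetical sample data; each method opens its own stream over the list.
    static final List<Integer> SCORES = Arrays.asList(100, 200, 300);

    static long count() {
        return SCORES.stream().collect(Collectors.counting());
    }

    // summingInt() yields a single number: the sum.
    static int sum() {
        return SCORES.stream().collect(Collectors.summingInt(Integer::intValue));
    }

    static double average() {
        return SCORES.stream().collect(Collectors.averagingInt(Integer::intValue));
    }

    static Optional<Integer> max() {
        return SCORES.stream().collect(Collectors.maxBy(Comparator.naturalOrder()));
    }

    // summarizingInt() yields an object holding count, sum, min, max and average.
    static IntSummaryStatistics summary() {
        return SCORES.stream().collect(Collectors.summarizingInt(Integer::intValue));
    }

    public static void main(String[] args) {
        System.out.println(count());            // 3
        System.out.println(sum());              // 600
        System.out.println(average());          // 200.0
        System.out.println(max().get());        // 300
        System.out.println(summary().getMin()); // 100
    }
}
```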

Reducing : reducing()

IntStream defines only the three-argument collect(), so convert the IntStream to a Stream<Integer> with boxed() to use the one-argument collect().

public interface IntStream extends BaseStream<Integer, IntStream> {
    ...
    <R> R collect(Supplier<R> supplier,
                      ObjIntConsumer<R> accumulator,
                      BiConsumer<R, R> combiner);
    ...
}

public interface Stream<T> extends BaseStream<T, Stream<T>> {
    ...
    <R, A> R collect(Collector<? super T, A, R> collector);
    ...
    <R> R collect(Supplier<R> supplier,
                  BiConsumer<R, ? super T> accumulator,
                  BiConsumer<R, R> combiner);
}

IntStream intStream = new Random().ints(1, 46).distinct().limit(6);

OptionalInt       max = intStream.reduce(Integer::max);
Optional<Integer> max = intStream.boxed().collect(Collectors.reducing(Integer::max));

long sum = intStream.reduce(0, (a, b) -> a + b);
long sum = intStream.boxed().collect(Collectors.reducing(0, (a, b) -> a + b));

int grandTotal = studentStream.map(Student::getTotalScore).reduce(0, Integer::sum);
int grandTotal = studentStream.collect(Collectors.reducing(0, Student::getTotalScore, Integer::sum));

The three Collectors.reducing() overloads used above (wildcards removed for brevity)

- The first two work like reduce(); the third combines map() and reduce() into one.

Collector reducing(BinaryOperator<T> op)
Collector reducing(T identity, BinaryOperator<T> op)
Collector reducing(U identity, Function<T, U> mapper, BinaryOperator<U> op)
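The three overloads can be exercised in one small program. This is only a sketch under assumed names (ReducingDemo, max, sum, totalLength); it uses integers and strings rather than the Student class.

```java
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class ReducingDemo {

    // reducing(op): no identity value, so the result is wrapped in an Optional.
    static Optional<Integer> max(IntStream ints) {
        return ints.boxed().collect(Collectors.reducing(Integer::max));
    }

    // reducing(identity, op): starts from the identity, so an empty stream yields 0.
    static Integer sum(IntStream ints) {
        return ints.boxed().collect(Collectors.reducing(0, Integer::sum));
    }

    // reducing(identity, mapper, op): maps each element first, then reduces.
    static Integer totalLength(Stream<String> words) {
        return words.collect(Collectors.reducing(0, String::length, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(max(IntStream.of(3, 9, 4)).get());       // 9
        System.out.println(max(IntStream.empty()).isPresent());     // false
        System.out.println(sum(IntStream.rangeClosed(1, 4)));       // 10
        System.out.println(totalLength(Stream.of("ab", "cde")));    // 5
    }
}
```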

String Joining : joining()

Concatenates all elements of a string stream into a single string.

A delimiter can be specified, as can a prefix and a suffix.

Joining works only when the stream's elements are descendants of CharSequence, such as String or StringBuffer.

- If the elements are not strings, first convert them to strings with map().

- For example, map each element with toString() and join the results.

String studentNames = studentStream.map(Student::getName)
                                   .collect(Collectors.joining());
String studentNames = studentStream.map(Student::getName)
                                   .collect(Collectors.joining(","));
String studentNames = studentStream.map(Student::getName)
                                   .collect(Collectors.joining(",", "[", "]"));
                                   
// join the toString() results - map to String first, because joining() accepts only CharSequence elements
String studentInfo = studentStream.map(Student::toString).collect(Collectors.joining(", "));
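A runnable sketch of the joining() variants follows, including the map() step needed for non-string elements. JoiningDemo and its method names are hypothetical.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JoiningDemo {

    static String plain(Stream<String> s) {
        return s.collect(Collectors.joining());
    }

    static String withDelimiter(Stream<String> s) {
        return s.collect(Collectors.joining(","));
    }

    static String withPrefixAndSuffix(Stream<String> s) {
        return s.collect(Collectors.joining(",", "[", "]"));
    }

    // Integer is not a CharSequence, so convert each element to a String first.
    static String numbers(Stream<Integer> s) {
        return s.map(String::valueOf).collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        System.out.println(plain(Stream.of("a", "b", "c")));               // abc
        System.out.println(withDelimiter(Stream.of("a", "b", "c")));       // a,b,c
        System.out.println(withPrefixAndSuffix(Stream.of("a", "b", "c"))); // [a,b,c]
        System.out.println(numbers(Stream.of(1, 2, 3)));                   // 1, 2, 3
    }
}
```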

Grouping/Partitioning : groupingBy(), partitioningBy()

Grouping : groups the stream's elements by some criterion - elements are classified by a Function.

Partitioning : splits the stream's elements into two groups, those that match a given condition and those that do not - elements are classified by a Predicate.

Collector groupingBy(Function classifier)
Collector groupingBy(Function classifier, Collector downstream)
Collector groupingBy(Function classifier, Supplier mapFactory, Collector downstream)

Collector partitioningBy(Predicate predicate)
Collector partitioningBy(Predicate predicate, Collector downstream)

The two are the same apart from whether classification uses a Function or a Predicate.

If a stream needs to be split into exactly two groups, partitioningBy() is faster - otherwise use groupingBy().

Both grouping and partitioning return their result in a Map.
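One practical difference between the two shows up in the keys of the resulting Map: a partitioned result contains both the true and the false key even when one side is empty, while a grouped result only has keys that actually occurred. The sketch below uses hypothetical names (PartitionVsGroupDemo) and integers as sample data.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class PartitionVsGroupDemo {

    // Predicate-based split: both keys are always present.
    static Map<Boolean, List<Integer>> partitionEven(Stream<Integer> numbers) {
        return numbers.collect(Collectors.partitioningBy(n -> n % 2 == 0));
    }

    // Function-based grouping: a key exists only if some element mapped to it.
    static Map<Boolean, List<Integer>> groupEven(Stream<Integer> numbers) {
        return numbers.collect(Collectors.groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        System.out.println(partitionEven(Stream.of(1, 3, 5))); // {false=[1, 3, 5], true=[]}
        System.out.println(groupEven(Stream.of(1, 3, 5)));     // {false=[1, 3, 5]}
    }
}
```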

Example for grouping/partitioning - Student Class

class Student {

    String name;    // name
    boolean isMale; // sex (true = male)
    int grade;      // grade (school year)
    int group;      // group
    int score;      // score

    Student(String name, boolean isMale, int grade, int group, int score) {
        this.name = name;
        this.isMale = isMale;
        this.grade = grade;
        this.group = group;
        this.score = score;
    }

    String getName() {
        return name;
    }

    boolean isMale() {
        return isMale;
    }

    int getGrade() {
        return grade;
    }

    int getGroup() {
        return group;
    }

    int getScore() {
        return score;
    }

    public String toString() {
        return String.format("[%s, %s, %d학년, %d그룹, %3d점]",
            name, isMale ? "남" : "여", grade, group, score);
    }

    enum Level {HIGH, MIDDLE, LOW}
}

Example for grouping/partitioning - studentArr array

Student[] studentArr = {
    new Student("가나다", true,  1, 1, 100),
    new Student("나다라", true,  1, 2, 200),
    new Student("라마바", false, 1, 3, 300),
    new Student("사아자", false, 1, 4, 200),
    new Student("아자차", true,  1, 5, 100),
    new Student("차카타", true,  1, 6, 200),
    new Student("파하가", true,  1, 7, 150),

    new Student("나다라", false, 2, 1, 240),
    new Student("마바사", false, 2, 2, 250),
    new Student("아자차", true,  2, 3, 300),
    new Student("카타파", true,  2, 4, 200),
    new Student("하가나", true,  2, 5, 200),
    new Student("다마바", false, 2, 6, 100),
    new Student("차카타", false, 2, 7, 150)
};

Classification with partitioningBy()

Basic partitioning (true / false)

Split by sex and collect into Lists

// 1. Basic partitioning
Map<Boolean, List<Student>> studentBySex = Stream.of(studentArr)
            .collect(Collectors.partitioningBy(Student::isMale)); // classify students by sex

List<Student> maleStudent   = studentBySex.get(true);   // male students from the Map
List<Student> femaleStudent = studentBySex.get(false);  // female students from the Map

Basic partitioning (true / false) & statistics

Collectors.counting() : counting

// Basic partitioning + statistics
Map<Boolean, Long> studentNumberBySex = Stream.of(studentArr)
    .collect(Collectors.partitioningBy(Student::isMale, Collectors.counting()));

long countMale   = studentNumberBySex.get(true);  // 8 : number of male students
long countFemale = studentNumberBySex.get(false); // 6 : number of female students

Collectors.summingLong() : total score

Map<Boolean, Long> scoreSumBySex = Stream.of(studentArr)
    .collect(Collectors
       .partitioningBy(Student::isMale, Collectors.summingLong(Student::getScore)));
long scoreSumMale   = scoreSumBySex.get(true);  // 1450 : total score of male students
long scoreSumFemale = scoreSumBySex.get(false); // 1240 : total score of female students

Collectors.maxBy() : top scorer -> maxBy() returns Optional<T>

Map<Boolean, Optional<Student>> topScoreBySex = Stream.of(studentArr)
    .collect(Collectors.partitioningBy(Student::isMale,
       Collectors.maxBy(Comparator.comparingInt(Student::getScore))));
       
Optional<Student> topScoreMale   = topScoreBySex.get(true);
Optional<Student> topScoreFemale = topScoreBySex.get(false);

String topScoreMaleString   = topScoreMale.toString();   // Optional[[아자차, 남, 2학년, 3그룹, 300점]]
String topScoreFemaleString = topScoreFemale.toString(); // Optional[[라마바, 여, 1학년, 3그룹, 300점]]

Top scorer - combining Collectors.collectingAndThen() with Optional::get yields Student as the value type

Map<Boolean, Student> topScoreBySexReturnStudent = Stream.of(studentArr)
    .collect(Collectors.partitioningBy(Student::isMale,
       Collectors.collectingAndThen(
           Collectors.maxBy(Comparator.comparingInt(Student::getScore)), Optional::get)));

Student topScoreMaleStudent   = topScoreBySexReturnStudent.get(true);
Student topScoreFemaleStudent = topScoreBySexReturnStudent.get(false);

String topScoreMaleStudentString   = topScoreMaleStudent.toString();   // [아자차, 남, 2학년, 3그룹, 300점]
String topScoreFemaleStudentString = topScoreFemaleStudent.toString(); // [라마바, 여, 1학년, 3그룹, 300점]
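A caveat with the Optional::get approach: the finisher is applied to both partitions, so if one side is empty, Optional::get throws NoSuchElementException. The sketch below, with hypothetical names (TopByPartitionDemo) and integers as data, contrasts it with a finisher that falls back to null via orElse(null); whether null is an acceptable fallback depends on the caller, so treat it as one possible choice rather than a recommendation.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TopByPartitionDemo {

    // Largest value per partition; an empty partition maps to null instead of throwing.
    static Map<Boolean, Integer> topOrNull(Stream<Integer> numbers) {
        return numbers.collect(Collectors.partitioningBy(n -> n % 2 == 0,
                Collectors.collectingAndThen(
                        Collectors.maxBy(Comparator.<Integer>naturalOrder()),
                        (Optional<Integer> top) -> top.orElse(null))));
    }

    // Returns true when Optional::get fails because the "even" partition is empty.
    static boolean getFailsOnEmptyPartition() {
        try {
            Stream.of(1, 3, 5).collect(Collectors.partitioningBy(n -> n % 2 == 0,
                    Collectors.collectingAndThen(
                            Collectors.maxBy(Comparator.<Integer>naturalOrder()),
                            Optional::get)));
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(topOrNull(Stream.of(1, 3, 5))); // {false=5, true=null}
        System.out.println(getFailsOnEmptyPartition());    // true
    }
}
```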

Double partitioning (basic partition (true / false) + partition by score)

partitioningBy() can be nested to partition twice.

Map<Boolean, Map<Boolean, List<Student>>> failedStudentBySex = Stream.of(studentArr)
    .collect(Collectors.partitioningBy(Student::isMale,
        Collectors.partitioningBy(student -> student.getScore() <= 100)));

List<Student> failedMaleStudents   = failedStudentBySex.get(true).get(true);
List<Student> failedFemaleStudents = failedStudentBySex.get(false).get(true);

for (Student student : failedMaleStudents) {
    System.out.println(student);
}
// [가나다, 남, 1학년, 1그룹, 100점]
// [아자차, 남, 1학년, 5그룹, 100점]
 
for (Student student : failedFemaleStudents) {
    System.out.println(student);
}
// [다마바, 여, 2학년, 6그룹, 100점]

Classification with groupingBy()

Grouping by group - toList() : the default

If the downstream collector is omitted, toList() is used by default.

Map<Integer, List<Student>> studentByGroupList = Stream.of(studentArr)
    .collect(Collectors.groupingBy(Student::getGroup, Collectors.toList()));
    
// toList() can be omitted - it is the default
Map<Integer, List<Student>> studentByGroupList = Stream.of(studentArr)
    .collect(Collectors.groupingBy(Student::getGroup));
    
for (List<Student> group : studentByGroupList.values()) {
    System.out.println();
    for (Student student : group) {
        System.out.println(student);
    }
}

[가나다, 남, 1학년, 1그룹, 100점]
[나다라, 여, 2학년, 1그룹, 240점]

[나다라, 남, 1학년, 2그룹, 200점]
[마바사, 여, 2학년, 2그룹, 250점]

[라마바, 여, 1학년, 3그룹, 300점]
[아자차, 남, 2학년, 3그룹, 300점]

[사아자, 여, 1학년, 4그룹, 200점]
[카타파, 남, 2학년, 4그룹, 200점]

[아자차, 남, 1학년, 5그룹, 100점]
[하가나, 남, 2학년, 5그룹, 200점]

[차카타, 남, 1학년, 6그룹, 200점]
[다마바, 여, 2학년, 6그룹, 100점]

[파하가, 남, 1학년, 7그룹, 150점]
[차카타, 여, 2학년, 7그룹, 150점]

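Grouping by group - toSet(), toCollection(constructor)

When the downstream is not the default (toList), take care with the generic type of the result.

toCollection() requires a constructor reference as its argument.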

Map<Integer, Set<Student>> studentByGroupSet = Stream.of(studentArr)
    .collect(Collectors.groupingBy(Student::getGroup, Collectors.toSet()));
        
Map<Integer, HashSet<Student>> studentByGroupHashSet = Stream.of(studentArr)
    .collect(Collectors.groupingBy(Student::getGroup, Collectors.toCollection(HashSet::new)));

Grouping by score level (using an Enum)

Map<Student.Level, List<Student>> studentByLevel = Stream.of(studentArr)
    .collect(Collectors.groupingBy(student -> {
             if(student.getScore() >= 250) return Student.Level.HIGH;
        else if(student.getScore() >= 150) return Student.Level.MIDDLE;
        else                               return Student.Level.LOW;
    }));

Set<Student.Level> keySet = new TreeSet<>(studentByLevel.keySet());

for(Student.Level key : keySet) {
    System.out.println("[" + key + "]");

    for(Student student : studentByLevel.get(key)) {
        System.out.println(student);
    }
    System.out.println();
}

[HIGH]
[라마바, 여, 1학년, 3그룹, 300점]
[마바사, 여, 2학년, 2그룹, 250점]
[아자차, 남, 2학년, 3그룹, 300점]

[MIDDLE]
[나다라, 남, 1학년, 2그룹, 200점]
[사아자, 여, 1학년, 4그룹, 200점]
[차카타, 남, 1학년, 6그룹, 200점]
[파하가, 남, 1학년, 7그룹, 150점]
[나다라, 여, 2학년, 1그룹, 240점]
[카타파, 남, 2학년, 4그룹, 200점]
[하가나, 남, 2학년, 5그룹, 200점]
[차카타, 여, 2학년, 7그룹, 150점]

[LOW]
[가나다, 남, 1학년, 1그룹, 100점]
[아자차, 남, 1학년, 5그룹, 100점]
[다마바, 여, 2학년, 6그룹, 100점]

Grouping by score level (using an Enum) + statistics

Map<Student.Level, Long> studentCountByLevel = Stream.of(studentArr)
    .collect(Collectors.groupingBy(student -> {
             if(student.getScore() >= 250) return Student.Level.HIGH;
        else if(student.getScore() >= 150) return Student.Level.MIDDLE;
        else                               return Student.Level.LOW;
    }, Collectors.counting()));

for(Student.Level key : studentCountByLevel.keySet()) {
    System.out.printf("[%s] - %d명, ", key, studentCountByLevel.get(key));
}

[HIGH] - 3명, [LOW] - 3명, [MIDDLE] - 8명,
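The output above follows HashMap iteration order, which is why the levels do not appear in declaration order. The three-argument groupingBy() (classifier, mapFactory, downstream) listed earlier lets you choose the Map implementation; with an EnumMap the keys iterate in enum declaration order. The following is a sketch under assumed names (LevelCountDemo, levelOf) that uses plain scores instead of Student objects.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LevelCountDemo {

    enum Level { HIGH, MIDDLE, LOW }

    // Same thresholds as the example above: 250 and up is HIGH, 150 and up is MIDDLE.
    static Level levelOf(int score) {
        if (score >= 250) return Level.HIGH;
        if (score >= 150) return Level.MIDDLE;
        return Level.LOW;
    }

    // mapFactory supplies an EnumMap, so keys come out as HIGH, MIDDLE, LOW.
    static Map<Level, Long> countByLevel(Stream<Integer> scores) {
        return scores.collect(Collectors.groupingBy(
                LevelCountDemo::levelOf,
                () -> new EnumMap<>(Level.class),
                Collectors.counting()));
    }

    public static void main(String[] args) {
        System.out.println(countByLevel(Stream.of(100, 200, 300, 150, 240, 100)));
        // {HIGH=1, MIDDLE=3, LOW=2}
    }
}
```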

Multi-level grouping

Like partitioningBy(), groupingBy() can be nested for multi-level grouping.

Group by grade, then group again by group

Map<Integer, Map<Integer, List<Student>>> studentByGradeAndGroup = Stream.of(studentArr)
    .collect(
        Collectors.groupingBy(Student::getGrade, Collectors.groupingBy(Student::getGroup)));

for(Map<Integer, List<Student>> grade : studentByGradeAndGroup.values()) {
    for(List<Student> group : grade.values()) {
        System.out.println();
        for(Student student : group) {
            System.out.println(student);
        }
    }
}

[가나다, 남, 1학년, 1그룹, 100점]

[나다라, 남, 1학년, 2그룹, 200점]

[라마바, 여, 1학년, 3그룹, 300점]

[사아자, 여, 1학년, 4그룹, 200점]

[아자차, 남, 1학년, 5그룹, 100점]

[차카타, 남, 1학년, 6그룹, 200점]

[파하가, 남, 1학년, 7그룹, 150점]

[나다라, 여, 2학년, 1그룹, 240점]

[마바사, 여, 2학년, 2그룹, 250점]

[아자차, 남, 2학년, 3그룹, 300점]

[카타파, 남, 2학년, 4그룹, 200점]

[하가나, 남, 2학년, 5그룹, 200점]

[다마바, 여, 2학년, 6그룹, 100점]

[차카타, 여, 2학년, 7그룹, 150점]

Multi-level grouping, application 1

Group by grade, then by group, and print only the top scorer of each group

Additionally uses collectingAndThen() and maxBy()

Map<Integer, Map<Integer, Student>> topStudentByGradeAndGroup = Stream.of(studentArr)
    .collect(
        Collectors.groupingBy(Student::getGrade,
            Collectors.groupingBy(Student::getGroup,
                Collectors.collectingAndThen(
                    Collectors.maxBy(Comparator.comparingInt(Student::getScore)),
                    Optional::get))));

for(Map<Integer, Student> group : topStudentByGradeAndGroup.values()) {
    for(Student student : group.values()) {
        System.out.println(student);
    }
}

[가나다, 남, 1학년, 1그룹, 100점]
[나다라, 남, 1학년, 2그룹, 200점]
[라마바, 여, 1학년, 3그룹, 300점]
[사아자, 여, 1학년, 4그룹, 200점]
[아자차, 남, 1학년, 5그룹, 100점]
[차카타, 남, 1학년, 6그룹, 200점]
[파하가, 남, 1학년, 7그룹, 150점]
[나다라, 여, 2학년, 1그룹, 240점]
[마바사, 여, 2학년, 2그룹, 250점]
[아자차, 남, 2학년, 3그룹, 300점]
[카타파, 남, 2학년, 4그룹, 200점]
[하가나, 남, 2학년, 5그룹, 200점]
[다마바, 여, 2학년, 6그룹, 100점]
[차카타, 여, 2학년, 7그룹, 150점]

Multi-level grouping, application 2

Group by grade + group, then map each student to a score level (mapping) and store the levels in a Set

Map<String, Set<Student.Level>> studentByScoreGroup = Stream.of(studentArr)
    .collect(Collectors
        .groupingBy(student -> student.getGrade() + "-" + student.getGroup(),
            Collectors.mapping(student -> {
                     if(student.getScore() >= 250) return Student.Level.HIGH;
                else if(student.getScore() >= 150) return Student.Level.MIDDLE;
                else                               return Student.Level.LOW;
            }, Collectors.toSet())));

Set<String> keySet2 = studentByScoreGroup.keySet();

for(String key : keySet2) {
    System.out.println("[" + key + "]" + studentByScoreGroup.get(key));
}
[1-1][LOW]
[2-1][MIDDLE]
[1-2][MIDDLE]
[2-2][HIGH]
[1-3][HIGH]
[2-3][HIGH]
[1-4][MIDDLE]
[2-4][MIDDLE]
[1-5][LOW]
[2-5][MIDDLE]
[1-6][MIDDLE]
[2-6][LOW]
[1-7][MIDDLE]
[2-7][MIDDLE]

// With several levels in one group, it would look like [2-7][MIDDLE, HIGH]
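To see a group that really holds more than one mapped value, here is a smaller sketch of groupingBy() combined with mapping(). The names (MappingDemo, firstLettersByLength) and the sample words are assumptions for illustration; the pattern is the same as above, with a transformation applied to each element before it reaches the downstream toSet().

```java
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class MappingDemo {

    // Group words by length, keep only each word's first letter, de-duplicate in a Set.
    static Map<Integer, Set<String>> firstLettersByLength(Stream<String> words) {
        return words.collect(Collectors.groupingBy(
                String::length,
                Collectors.mapping(w -> w.substring(0, 1), Collectors.toSet())));
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> result =
                firstLettersByLength(Stream.of("kim", "kay", "lee", "kang"));
        System.out.println(result.get(3).size()); // 2 ("k" from kim/kay is stored once, plus "l")
        System.out.println(result.get(4));        // [k]
    }
}
```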

Attribution, NonCommercial, NoDerivatives
